|
Sunday, 01 March 2009 18:27 |
|
There's a bit of debate over whether UUIDs (or GUIDs, in Microsoft software) are suitable as primary keys for objects in a database. I'm not going to rehash the debate; there's no point in arguing, because it comes down to a design decision. You can weigh the pros and cons and decide for yourself.

ID Creation
One of the big concerns with autoincremented database fields is that when a lot of inserts are happening, especially distributed inserts, everyone waits on the database to generate a new id. Some large distributed architectures actually have dedicated servers whose only purpose is to hand out ids to objects. The advantage of autoincremented values is a guarantee of uniqueness. However, that guarantee creates a performance bottleneck, because generating the next id requires mutually exclusive access to the database. Parallelism doesn't remove the bottleneck: to avoid race conditions there is still only one mutex lock, so everyone still waits in one line for an id. The UUID solution is instead to generate an id that is not guaranteed to be unique, but has a very high probability of being unique. From Wikipedia: "... after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%."
I'm pretty comfortable with those odds; I don't see myself generating a billion UUIDs a second anytime soon.

Storage in MySQL
I just generated a UUID:
    1e8ef774-581c-102c-bcfe-f1ab81872213
A UUID like the one above is 36 characters long, including dashes. If you store it as a VARCHAR(36), you're going to hurt comparison performance dramatically. This is your primary key; you don't want it to be slow. At the bit level, a UUID is 128 bits, which means it fits into 16 bytes. That's not very human readable, but it keeps storage low: only 4 times larger than a 32-bit int, or 2 times larger than a 64-bit int. I will use a VARBINARY(16). In theory, this works without a lot of overhead.

Use My Own UUID Generator
I'm not really satisfied with MySQL's UUID generator: it uses a very old standard based on timestamps and MAC addresses (Version 1). This can be a problem because a UUID can be traced to a particular machine and a particular time. I'd rather have more random bits than a machine signature; who really needs to trace their UUIDs back to the machine that generated them? I just want them to be unique, and don't need unnecessary bread crumbs lying around. Most likely I'd use Version 3, 4, or 5 (Version 4 is the random one; Versions 3 and 5 are derived by hashing a name). I'll be using my own UUID generator, which is a fantastic side effect of UUIDs: you can generate the numbers however you want. You should follow the spec, though, or your results are not guaranteed.

Create table
    create table uuid_demo (
        id VARBINARY(16),
        name VARCHAR(10)
    );
I created a simple table to illustrate usage.

INSERTS, SELECTS
I'm using the mysql console to demonstrate. Inserting the UUID goes like so:
    mysql> INSERT INTO uuid_demo SET id=0x1e8ef774581c102cbcfef1ab81872213, name="Kekoa";
    Query OK, 1 row affected (0.00 sec)
Notice that the id is not entered as a string, but as a large hex number. It is the same UUID as generated above, but without the dashes and with a 0x prefix to indicate hex.
Now to display what's in the table:
    mysql> SELECT * FROM uuid_demo;
    +------------------+-------+
    | id               | name  |
    +------------------+-------+
    | #####,##??"      | Kekoa |
    +------------------+-------+
    1 row in set (0.00 sec)
Well, that's not very helpful. How do we retrieve the data? There are a couple of ways. Your SQL driver will correctly receive the binary data into your field, and then you'll have to parse it. Alternatively, you can use the HEX() function in MySQL to quickly do the conversion for you:
    mysql> SELECT hex(id), name FROM uuid_demo;
    +----------------------------------+-------+
    | hex(id)                          | name  |
    +----------------------------------+-------+
    | 1E8EF774581C102CBCFEF1AB81872213 | Kekoa |
    +----------------------------------+-------+
    1 row in set (0.00 sec)
Ahh, much better. You can treat the result as a string and add the dashes or whatever you need if you want it to be pretty. If your language supports binary data well, you may consider keeping the binary data as is. So wherever your queries reference the id, use the hex value of the UUID, and when retrieving rows, remember that the id is binary and handle it appropriately.
|
Sunday, 01 March 2009 17:43 |
|
PHP has been good to me. I started using it over 7 years ago as a Perl refugee, and because it was so easy to use, it gave me a strong understanding of programming in general. I started playing with PHP when PHP 4 came on the scene. I will admit that I probably didn't learn great design patterns, but things were simple, and there were basic objects. As I learned more about good programming through other languages, that influenced my PHP work. Java was probably the most influential in teaching good style. Unfortunately, PHP was not originally designed for objects; they were strapped on in PHP 4 in a sort of hodgepodge way. But with PHP 5 I really gained confidence that PHP was getting serious, as its support for objects was much improved. Is PHP a child's language? It is if it is in a child's hands. I think the great thing about PHP is that it doesn't limit you that much. If you want to make a terribly unorganized mess of code, that's your prerogative, and PHP doesn't get in your way. However, if you want to be disciplined and do things elegantly, it provides the language support for that as well. Anyone who criticizes PHP for being ugly or messy should really direct that at the programmer: I have seen some very elegant PHP code, and I have seen some utterly terrible code. Anyhow, I don't think the usefulness of PHP can be discredited, as it is used on many major websites, such as Facebook, Digg, and others that escape me. It has proven to be a real performer when it comes to scalability. PHP was my "gateway drug": it allowed me to get immersed in web programming technology without being overwhelmed by anything; I just got things done. It also gave me experience that I could later build upon while maturing into a professional computer scientist.
|
Tuesday, 24 February 2009 22:40 |
|
I've noticed that many people still want to do things the hard way. People don't want to use libraries to help them out. They construct JSON by hand, crafting curly braces and quoting their values. In PHP it is so simple to do JSON properly that you really have no excuse not to. You have a couple of options: the PHP json extension, which comes with most PHP installations nowadays, or another library. My preferred one is Zend_Json, which is part of the Zend Framework. Both have two functions, encode and decode. Encode takes in an object and spits out the string that represents it as JSON. Decode is the opposite: feed in JSON, and you get back a PHP structure. With json_decode() that is a generic object of type stdClass that mirrors the JSON exactly (JSON arrays become PHP arrays). Luckily, PHP and JavaScript are dynamic languages, so you can add properties to any object and change their types as you go; you don't need to declare a class in order to construct a message. To construct some JSON in PHP (using the PHP extension), here's an example:
    $obj = new stdClass();
    $obj->mythings = array(1, 2, "3", false);
    $obj->person = new stdClass();
    $obj->person->name = "Kekoa";
    $obj->person->age = 88;

    echo json_encode($obj);
Which prints out:
    {"mythings":[1,2,"3",false],"person":{"name":"Kekoa","age":88}}
Wow. Exciting: it even takes care of quoting strings and not quoting numbers and booleans. It also takes care of any nested arrays or objects. Decoding JSON is even easier:
    $jsonText = '{"mythings":[1,2,"3",false],"person":{"name":"Kekoa","age":88}}';

    $obj = json_decode($jsonText);
    print_r($obj);
When executed, print_r reveals the PHP structure of $obj (the boolean false doesn't get printed nicely, but it's there, I promise):
    stdClass Object
    (
        [mythings] => Array
            (
                [0] => 1
                [1] => 2
                [2] => 3
                [3] =>
            )

        [person] => stdClass Object
            (
                [name] => Kekoa
                [age] => 88
            )

    )
Since a JSON object maps just as naturally onto a PHP associative array, you can ask the decoder to return an associative array instead. Simply set the second parameter of json_decode() to true. This is what you'll get:
    Array
    (
        [mythings] => Array
            (
                [0] => 1
                [1] => 2
                [2] => 3
                [3] =>
            )

        [person] => Array
            (
                [name] => Kekoa
                [age] => 88
            )

    )
FYI, Zend_Json works almost exactly the same way, but uses the static methods Zend_Json::encode() and Zend_Json::decode() instead. The built-in functions are definitely faster (they're written in C), but the Zend_Json decoder is much more flexible (forgiving) if you're reading JSON from someone who, say, handcrafted it instead of using a library. Check out this outdated but still interesting comparison of JSON libraries: http://gggeek.altervista.org/sw/article_20070425.html
Do you need the whole Zend Framework to use Zend_Json? No. It's a component framework, so you can pull out the JSON utilities without any external coupling.
|
Tuesday, 10 February 2009 13:30 |
|
In my Scheme class, we went over how tail recursion allows for nice compiler optimizations.
http://en.wikipedia.org/wiki/Tail_recursion - Great resource
I decided to code up a quick experiment to see how this works. I have a simple recursive add function that adds up all the integers from 1 up to and including num. For num = 4 this is Pythagoras' tetrad: add(4) returns 1 + 2 + 3 + 4 = 10.
add-recursive.c :
    #include <stdio.h>

    int add(int num) {
        if (num == 0) return 0;
        return (num + add(num - 1));
    }

    int main(int argc, char** argv) {
        int count;
        if (argc > 1) {
            sscanf(argv[1], "%i", &count);
            printf("Result: %i\n", add(count));
        }
        return 0;
    }
Ok, let's try it out:
    $ ./add-recursive 4
    Result: 10
    $ ./add-recursive 100
    Result: 5050
    $ ./add-recursive 100000
    Result: 705082704
    $ ./add-recursive 10000000
    Segmentation fault
Uh oh. Looks like we overflowed the stack at 10000000. The solution to such a silly problem is tail recursion, a special form of recursion where the final operation is the recursive call itself. Normally, each call to add() adds a stack frame, but when a function simply returns the result of the next recursive call, the compiler can reuse the current stack frame for that call, averting any stack overflow problems. Here's a modified function that does the same thing, but allows for the tail recursion optimization.
add-tail-recursive.c :
    #include <stdio.h>

    int add(int num, int count) {
        if (num == 0) return count;
        count += num;
        return add(num - 1, count);
    }

    int main(int argc, char** argv) {
        int count;
        if (argc > 1) {
            sscanf(argv[1], "%i", &count);
            printf("Result: %i\n", add(count, 0));
        }
        return 0;
    }
Here are the new results:
    $ ./add-tail-recursive 4
    Result: 10
    $ ./add-tail-recursive 100
    Result: 5050
    $ ./add-tail-recursive 100000
    Result: 705082704
    $ ./add-tail-recursive 10000000
    Result: -2004260032
Hey, we got a result. Our integer overflowed, but that's beside the point: we got an answer, which means the recursion ran to completion without a stack overflow. The obvious response is that it's a dumb idea to use recursion for adding like that, and you should use a for loop. Yes, I know; I was just trying out tail recursion in hopes that I will one day find a great use for it. Note that when you compile your C, you'll want to turn on optimizations. I used -O2 with gcc; without that option, the tail-recursive version blows up the stack just like the first one.
|
Monday, 12 January 2009 19:10 |
|
I just have to say the past few weeks have been great as far as experimenting with different technologies goes. Over the Christmas break I was playing with Android, and as school starts I have found myself neck deep in Flash/ActionScript, Python, Amazon Web Services, and Scheme. It is mildly crazy, but it has been very fun. Lately, I have been able to detach from a single-track thought process and have been forced to explore many different areas. I would say it has helped me find my strengths and weaknesses more than when I was concentrating on only a couple of avenues in computer science. Anyhow, I'm excited for the semester; I think it will be great to finish things up soon. So far I'm definitely on top of things and really enjoying it, but I'll let you know in a couple of months if I still feel that way. Even with no exposure to ActionScript until lately, I have to say I like it as a language. It is (mostly) what I felt JavaScript should have been, though I've found the integration with Flash can be clunky. I'm starting to become more open to Python. I've tried to avoid it up to now, but perhaps I will get to know it better this time around; I am considering jumping in head first. As for Scheme, it makes me feel stupid, but I guess that's part of the process. I will sum up my experience so far as, "If you can do it in Scheme, you can do it in anything else easier."