Publishing dealnews.com every day takes a great content staff and (in my opinion) great developers. You get to see the content side every day on the front page, but some people want to know what we developers are up to. This is a collection of papers, code we have written, and tools we use.
Open Source Software
We use PHP everywhere in our stack. For us, it makes sense because we have hired a great staff of PHP developers. So, we leverage that talent by using PHP everywhere we can.
One place where people seem to stumble with PHP is long-running processes and parallel processing. The pcntl extension gives you the ability to fork PHP processes and run lots of children, as many other Unix daemons do. We use this for various things; most notably, we use it to run Gearman worker processes. While at the O'Reilly Open Source Convention in 2009, we were asked how we pulled this off. So, we are releasing the two scripts that handle the forking and some instructions on how we use them.
The first script, prefork.php, is for forking a given function from a given file and running n children that will execute that function. There can be a startup function that is run before any forking begins and a shutdown function to run when all the children have died.
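The general pattern can be sketched roughly as follows. This is a minimal illustration that assumes the pcntl extension is loaded under the PHP CLI. It is not the contents of prefork.php, and the function name prefork_run and its parameters are hypothetical.

```php
<?php
// Hypothetical sketch of the prefork pattern; not the actual prefork.php.
// Requires the pcntl extension (PHP CLI).

/**
 * Fork $count children that each run $worker, then wait for all of them.
 *
 * $worker receives the child's index; its integer return value is used
 * as the child's exit code. $startup runs once before any fork and
 * $shutdown runs once after every child has exited.
 *
 * Returns a map of child PID => exit code.
 */
function prefork_run(callable $worker, int $count, ?callable $startup = null, ?callable $shutdown = null): array
{
    if ($startup !== null) {
        $startup();
    }

    $children = [];
    for ($i = 0; $i < $count; $i++) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('Could not fork child ' . $i);
        }
        if ($pid === 0) {
            // Child: run the worker, then exit so the child never
            // continues the parent's loop.
            exit((int) $worker($i));
        }
        // Parent: remember the child so we can wait for it.
        $children[$pid] = true;
    }

    $exitCodes = [];
    while ($children) {
        $pid = pcntl_wait($status);
        if ($pid <= 0) {
            break; // no children left, or wait failed
        }
        unset($children[$pid]);
        $exitCodes[$pid] = pcntl_wexitstatus($status);
    }

    if ($shutdown !== null) {
        $shutdown();
    }

    return $exitCodes;
}
```

Calling prefork_run with a worker function and a child count of 5 would start five children and block until all of them have exited. A long-running daemon would more likely restart children as they die, which this sketch leaves out.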
The second script, prefork_class.php, uses a class with defined methods instead of relying on the command line for function names. This script has the added benefit of having functions that can be run just before each fork and after each fork. This allows the parent process to farm work out to each child by changing the variables that will be present when the child starts up. This is the script we use for managing our Gearman workers. We have a class that controls how many workers are started and what functions they provide. I may release a generic class that does that soon. Right now it is tied to our code library structure pretty tightly.
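To illustrate how hooks around each fork let the parent hand different state to each child, here is a small sketch. The class and method names (PreforkJob, beforeFork, afterFork, work, NamedWorkers) are assumptions made for this example and are not the API of prefork_class.php. It also assumes the pcntl extension under the PHP CLI.

```php
<?php
// Hypothetical sketch of class-driven preforking; names are illustrative only.

abstract class PreforkJob
{
    /** Number of children to start. */
    abstract public function childCount(): int;

    /** Runs in the parent just before each fork. */
    public function beforeFork(int $index): void {}

    /** Runs in the parent just after each fork. */
    public function afterFork(int $index, int $pid): void {}

    /** Runs in the child; the return value becomes its exit code. */
    abstract public function work(): int;
}

/** Fork one child per slot and return child PID => exit code. */
function prefork_class_run(PreforkJob $job): array
{
    $pids = [];
    for ($i = 0; $i < $job->childCount(); $i++) {
        $job->beforeFork($i);          // state set here is copied into the child
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('Could not fork child ' . $i);
        }
        if ($pid === 0) {
            exit($job->work());        // child never returns to the loop
        }
        $job->afterFork($i, $pid);
        $pids[] = $pid;
    }

    $exitCodes = [];
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);
        $exitCodes[$pid] = pcntl_wexitstatus($status);
    }
    return $exitCodes;
}

/** Example job: each child inherits a different name from the parent. */
final class NamedWorkers extends PreforkJob
{
    private $names;
    private $current = '';
    public $started = [];              // child PID => name, kept in the parent

    public function __construct(array $names)
    {
        $this->names = array_values($names);
    }

    public function childCount(): int
    {
        return count($this->names);
    }

    public function beforeFork(int $index): void
    {
        $this->current = $this->names[$index];
    }

    public function afterFork(int $index, int $pid): void
    {
        $this->started[$pid] = $this->current;
    }

    public function work(): int
    {
        // A real worker would process jobs here. The sketch just reports
        // the length of the inherited name as its exit code.
        return strlen($this->current);
    }
}
```

Because a forked child starts with a copy of the parent's memory, whatever beforeFork sets is what that child sees, while afterFork lets the parent keep its own bookkeeping.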
We have also included two examples. They are simple, but they do show you how the scripts work.
prefork_php.tar.gz
MemProxy is a simple (but very powerful) PHP script that proxies web requests and stores the contents in memcached. By being a full proxy, it allows the proxy servers to avoid heavy application level code. This makes it very fast and efficient. It also allows for a very simple setup on dedicated proxy servers.
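The core idea of a caching proxy, serving from memcached when possible and going to the application server only on a miss, might look something like the sketch below. This is not MemProxy's code. The function name proxy_fetch, the key scheme, and the default TTL are assumptions for illustration, and the cache is any object with Memcached-style get and set methods.

```php
<?php
// Hypothetical sketch of a cache-first proxy lookup; not MemProxy itself.

/**
 * Return the body for $url, using the cache when possible.
 *
 * $cache       any object exposing get($key) and set($key, $value, $ttl)
 *              in the style of the Memcached extension (get returns
 *              false on a miss)
 * $fetchOrigin callable that takes the URL and returns the response body
 *              from the application server
 * $ttl         seconds to keep the cached copy (assumed default)
 */
function proxy_fetch($cache, string $url, callable $fetchOrigin, int $ttl = 300): string
{
    // Hash the URL so the key stays within memcached's key length limit.
    $key = 'proxy:' . md5($url);

    $body = $cache->get($key);
    if ($body !== false) {
        return $body;                 // cache hit: no application code runs
    }

    $body = (string) $fetchOrigin($url);
    $cache->set($key, $body, $ttl);   // store for the next request
    return $body;
}
```

A real proxy also has to deal with response headers, status codes, and requests that must not be cached, all of which are left out here.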
MySQL Replication Log Purging
MySQL master servers won't remove their logs automatically when slaves are done with them. You can set expire_logs_days to remove them after a certain number of days, but that gives you no assurance that every slave is done with the logs. So, we wrote a script that connects to the slaves and then purges logs on the master servers. It works for us. Your mileage may vary.
flush_mysql_master.php.gz
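The decision at the heart of this kind of purge can be sketched as below. This is not the code in flush_mysql_master.php. It assumes you have already collected the Master_Log_File value from SHOW SLAVE STATUS on each slave, and the function names are hypothetical. PURGE BINARY LOGS TO removes the logs before the named file and keeps the named file itself.

```php
<?php
// Hypothetical sketch of choosing a safe purge target; not flush_mysql_master.php.

/** Extract the numeric sequence from a name such as "mysql-bin.000010". */
function log_sequence(string $file): int
{
    if (!preg_match('/^[\w.-]+\.(\d+)$/', $file, $m)) {
        throw new InvalidArgumentException('Unexpected log file name: ' . $file);
    }
    return (int) $m[1];
}

/**
 * Given the master log file each slave is currently reading,
 * return the oldest one. Every log before it is no longer needed.
 */
function oldest_log_in_use(array $slaveLogFiles): string
{
    if (!$slaveLogFiles) {
        throw new InvalidArgumentException('No slave log files given');
    }
    usort($slaveLogFiles, function (string $a, string $b): int {
        return log_sequence($a) <=> log_sequence($b);
    });
    return $slaveLogFiles[0];
}

/** Build the statement to run on the master. */
function purge_statement(string $file): string
{
    log_sequence($file); // validates the name before it goes into SQL
    return "PURGE BINARY LOGS TO '" . $file . "'";
}
```

If any slave cannot be reached, the safe choice is to skip the purge entirely, since its position is unknown.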
MySQL Database Backup
This is not really exciting, but it works quite well. Backing up our database was causing problems for us. If you run mysqldump for an entire database, it locks all the tables in the database. This is bad as our dumps were taking almost an hour to finish. So, we wrote this script to dump one table at a time. Each table goes to its own file and then they are all tar.gz'd together. You can create more than one config and run each one at different times. We have one for daily and one for hourly. The config includes the ability to skip tables.
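As a rough sketch of the one-table-at-a-time approach, the function below builds one mysqldump command per table and honors a skip list. It is not taken from dbbackup. The function name, the file naming scheme, and the lack of connection options are all assumptions for illustration.

```php
<?php
// Hypothetical sketch of per-table dumps; not the dbbackup script.

/**
 * Build one mysqldump command per table, skipping any table in $skip.
 * Returns a map of table name => shell command.
 */
function table_dump_commands(string $database, array $tables, array $skip, string $outDir): array
{
    $commands = [];
    foreach ($tables as $table) {
        if (in_array($table, $skip, true)) {
            continue; // table excluded by the config
        }
        // Each table goes to its own file (naming scheme is an assumption).
        $file = rtrim($outDir, '/') . '/' . $database . '.' . $table . '.sql';
        $commands[$table] = sprintf(
            'mysqldump %s %s > %s',
            escapeshellarg($database),
            escapeshellarg($table),
            escapeshellarg($file)
        );
    }
    return $commands;
}
```

Each command locks only the table it is dumping, so the rest of the database stays available. The resulting files can then be bundled into a single tar.gz archive.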
Updated to 0.2. Added some more options to the mysqldump command.
dbbackup-0.2.tar.gz
A host template and related script to graph memcached network traffic, connections, requests per second and cache hits and misses.
Writings
Senior Developer Brian Moon's blog can be found at http://doughboy.wordpress.com
Memcache Client API performance (version 2)
After writing up my findings in my last memcached test, I posted a link to it on the memcached mailing list. Some users had some useful questions and suggestions. So, I basically started from scratch and ran the tests again.
PHP and MySQL, the future
The tried and true MySQL extension (mysql) has been around in PHP for years. Lately, there has been a lot of buzz in the PHP world about the new "MySQL Improved Extension" (mysqli) and "PHP Data Objects" (PDO). When I work on Phorum or for dealnews.com, I am always concerned with performance. So, I thought I would give these new methods a try.
Memcache Client API performance
We are working on a new application here at dealnews (more on it soon) and decided to open our minds up to make sure we built the fastest application possible. So, I took a look at PHP (our standard language), Perl and Python. The application will be mostly working with memcached, so that was the focus of my tests.
Caching Dynamic Web Content to Increase Dependability and Performance
As more and more Web sites discover the advantages of storing content on a database server, dependability problems may be introduced that can be minimized through caching. Caching decreases the number of repetitive queries for unchanging data and increases the resources available for other queries, particularly complex queries like searching.