
a little tool to keep multiple developers on the same page

Posted by William in backend on October 2nd, 2009

When you have a few developers working on the same site, you can easily get into a very frustrating situation where they overwrite each other's changes and the site keeps getting broken back and forth. This happens when, say, one developer puts in a fix and releases it (some files) to the site; later on, another developer who is working on some other feature or fix releases his files, and some of those files overlap with the ones released by the first developer.

Now you can argue that you should use some sort of version control system (VCS) like svn or git and implement a release process. But if your developers have heterogeneous profiles and experience, not everyone uses a VCS, and if you have fewer than a handful of developers, you might not want the overhead of a release process, which requires some training and the authority to enforce it, and you might not have either.

So a much simpler approach is this: before anyone puts code out on the live site, make a diff between what's in the live code base and what you are going to push. If there is any difference that's not from you (and therefore from someone else), you need to merge that change; otherwise you undo what the other developer fixed or, worse, break the site.

There is still one problem: your change might be big and the other developer's change might be just a line or two, so when you do a diff you might very well miss the other developer's change. So here's an idea: set up a cron job on the webserver that finds all the files modified in the last 24 hours (or whatever interval you like) and emails the file list to the developers once a day. A developer can also run this little command right before he pushes code, so that if some of the files he's going to push are already in the recently modified list (from the command), he will be extra careful and make sure to resolve all the differences.

The command is rather simple:

cd /path/to/webroot && find . -mtime -1 -print | mail -s "files modified in last 24 hours" developer1@mysite.com developer2@mysite.com

Sometimes there are cache files that keep changing but are irrelevant to the site codebase. You can use "grep -v" to filter those out, for example:

cd /path/to/webroot && find . -mtime -1 -print | grep -v '/caches/' | mail -s "files modified in last 24 hours" developer1@mysite.com developer2@mysite.com

This will filter out any files or directories that have '/caches/' in their path.
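
The pre-push check described above could be sketched as a pair of shell functions. This is only a rough sketch: the function names (recent_files, overlap_with_push) and the idea of keeping a "push list" file with one ./relative/path per line are my own assumptions, not part of any existing tool.

```shell
# Hypothetical helper: list files under a web root that were modified in the
# last N days (default 1), skipping anything under a /caches/ directory.
recent_files() {
    webroot=$1
    days=${2:-1}
    ( cd "$webroot" && find . -type f -mtime "-$days" -print | grep -v '/caches/' | sort )
}

# Hypothetical helper: print the entries of a push list (one ./relative/path
# per line) that also show up in the recently modified list.
overlap_with_push() {
    webroot=$1
    push_list=$2
    recent_files "$webroot" | grep -Fx -f "$push_list"
}
```

For example, overlap_with_push /path/to/webroot my_push_list.txt prints only the files you are about to push that someone also touched recently; those are the ones to diff by hand.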

Optimizing PHP

Posted by William in backend on June 10th, 2009

A good article providing many insights on optimizing PHP.

Be careful when using PHP arrays.

PHP arrays are not efficient and use a lot of memory. So be very careful when using them, especially if the array could potentially have a lot of elements. Here's an example:

<?php
// Build a 10,000-element array, then report how much memory PHP has allocated.
$arr = array();
for ($i = 0; $i < 10000; $i++) {
    $arr[] = $i;
}

print "Memory used: " . number_format(memory_get_usage(true)) . "\n";
?>

This is an array of only 10,000 elements, but the script consumes over 2MB of memory. Imagine you have 10,000 concurrent users on your webserver invoking a script that does something like this: you can easily crash the machine by running out of memory.

The lesson here is that PHP arrays are best kept to small amounts of data, ideally not more than a few hundred items.

get date string using linux unix shell command

Posted by William in backend on May 17th, 2009

Sometimes you need the date string as part of a shell command. For example, you want to do a daily dump of your MySQL database as a backup. You can do that with a cron job, and you want the date string in the name of the file it dumps to. You can use something like this as your cron job (if you put the command directly on a crontab line, escape each % as \%, because cron treats an unescaped % as a newline):

mysqldump -hlocalhost -uroot mydb > /home/john/db_backup/mydb-`date +%Y%m%d`.sql

There are also times when you want to do something with yesterday's activity. For example, you have a daily log of some user activity, and you would like to run a script early every morning to process yesterday's log, so that by morning you have the reports for yesterday. So here you need to get the date string for yesterday.

In this case, you can use the --date option of GNU date together with +format. For example:

yesterday=`date --date="1 days ago" +%Y%m%d`; /home/john/my_report_script.sh -log /var/log/mylog-$yesterday.log
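
If you need this in more than one script, the same idea can be wrapped in a small function. This is a sketch only: the name days_ago and its optional second argument (a reference date, handy for re-running a report for a past day) are my own invention, and it relies on GNU date, so it will not work as-is with the BSD or macOS date command.

```shell
# Hypothetical helper: print the YYYYMMDD string for N days before a
# reference date. With no reference date, it counts back from today.
# Relies on the --date option of GNU date.
days_ago() {
    n=$1
    ref=$2
    if [ -n "$ref" ]; then
        date --date="$ref $n days ago" +%Y%m%d
    else
        date --date="$n days ago" +%Y%m%d
    fi
}
```

Usage would look like: yesterday=$(days_ago 1); /home/john/my_report_script.sh -log /var/log/mylog-$yesterday.log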

be consistent with php function return type

Posted by William in backend on April 20th, 2009

PHP is a dynamically typed language, which gives you the flexibility to, for example, return anything from a function. This, however, can cause hard-to-find bugs if you abuse it. A better choice is to be consistent with return types, by which I mean: if the function is supposed to return a boolean, only return a boolean; if it's supposed to return a string, always return a string; if it's supposed to return an array, always return an array.

Here are some examples that show how your code can break if you do not pay attention to return types.

Example 1. Disaster: returning either boolean true or a string:

function foo()
{
    if (some_condition())
        return "ERROR_1";
    else if (some_other_condition())
        return "ERROR_2";
    else
        return true;
}

So the author of this function is trying to return an error code (as a string, bad idea!), or true if everything is good. The problem is that any non-empty string (other than "0") evaluates to true in a loose comparison. So if you have a caller like this:

$rtn = foo();
if ($rtn == true) {
    // foo is good.
} else if ($rtn == "ERROR_1") {
    // handle ERROR_1
}

You will have a bug here. The "else" branch will never be executed, because comparing any non-empty string (other than "0") to boolean true with the "==" operator always yields true. You could use === to force a type check, but your callers may not be aware of that. Some solutions to this would be:

  • make the function return true or false only, and throw exceptions instead of returning error codes
  • if you do intend to return an error code, either as a string or as a defined constant (a better idea), never return a boolean. For example, rather than returning boolean true, return a constant RTN_OK

In other words, when you design your function, once you decide what type it should return, always return that type.

Example 2: a function that returns an array.

For example, you have a function that tries to get some data from memcache. If the data is not found in memcache, it gets it from the database and caches it in memcache.

function get_customers($bookid)
{
    global $memcache; // assumes a connected Memcache client in global scope

    $mc_key = "bk-cust-$bookid";
    if (!$customers = $memcache->get($mc_key))
    {
        // no records in memcache
        $customers = array();
        $result = mysql_query("SELECT * FROM book_customers WHERE bookid=$bookid");
        while ($row = mysql_fetch_array($result))
            $customers[] = $row;
        $memcache->set($mc_key, $customers); // cache it.
    }
    return $customers;
}

The problem with this function is that if nothing is found in the database initially, you end up storing an empty array in memcache; then, when you call $memcache->get(), you get it back as a boolean true instead of an array.

synchronize clock with ntpdate

Posted by William in backend on March 18th, 2009

To get the correct time on your Linux machine, try the ntpdate command:

sudo ntpdate ntp.nasa.gov

A list of NTP servers on the internet:

Server address                  Location
ntp.ipv6.viagenie.qc.ca         IPv6 only
clock.via.net
fartein.ifi.uio.no              Norway
ntp.uio.no                      Norway
ntp.eunet.no                    Norway
ntp.demon.co.uk                 UK
ntp.nasa.gov                    USA
bigben.cac.washington.edu       USA
time-b.nist.gov                 USA
montpelier.ilan.caltech.edu     USA
nist1.aol-ca.truetime.com       USA
nist1.datum.com                 USA
time-a.timefreq.bldrdoc.gov     USA
time-b.timefreq.bldrdoc.gov     USA
time-c.timefreq.bldrdoc.gov     USA
time.nist.gov                   USA
utcnist.colorado.edu            USA
tick.usno.navy.mil              USA
tock.usno.navy.mil              USA
mizbeaver.udel.edu              USA

You might want to set up this command as a daily cron job to correct the clock if it tends to drift often.
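
For the cron job, a small wrapper that tries several servers in turn might be handy, since any single public server can be unreachable. This is just a sketch under my own assumptions: the function name sync_clock and the script path in the crontab comment are hypothetical, and ntpdate has to run as root to set the clock.

```shell
# Hypothetical wrapper: try each server in order and stop at the first one
# that ntpdate manages to sync against. Returns non-zero if none worked.
sync_clock() {
    for server in "$@"; do
        if ntpdate "$server"; then
            return 0
        fi
    done
    return 1
}

# Example root crontab line (hypothetical script path), running daily at 04:17:
# 17 4 * * * /usr/local/bin/sync_clock.sh
```

The script itself would end with a call such as sync_clock ntp.nasa.gov time.nist.gov, using servers from the list above.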

Reference: http://linuxreviews.org/howtos/ntp/

simple example of using linux screen command

Posted by William in backend on February 25th, 2009

To start a named screen session:

screen -S myname

To detach from current screen:

ctrl-a d

To list screens:

screen -ls

To reattach to a screen:

screen -r screen_name

To reattach to a screen and detach the existing attached screen:

screen -dr screen_name

To create a window inside a screen:

ctrl-a c

To go to the next or previous window:

ctrl-a n

ctrl-a p
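
The commands above combine nicely into a "reattach if it exists, otherwise create it" helper. This is only a sketch: the function name is made up, and it assumes that screen -ls prints each session as pid.name followed by whitespace and that the session name contains no regex special characters.

```shell
# Hypothetical helper: reattach to the named session (detaching it elsewhere
# if needed) when it already exists, otherwise start a new named session.
screen_attach_or_create() {
    name=$1
    if screen -ls | grep -q "[0-9]\.$name[[:space:]]"; then
        screen -dr "$name"
    else
        screen -S "$name"
    fi
}
```

So screen_attach_or_create myname does the right thing whether or not you already started the session earlier.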

Reference:

http://www.soulcast.com/post/show/55079/An-introduction-to-the-linux-screen-command

Named pipes

Posted by William in backend on January 24th, 2009

Pipes are a very useful feature on Linux/Unix. They allow separate processes to communicate easily.

A simple example:

grep xyz log | wc -l

The first process finds the lines that contain "xyz" in the file "log", and its output feeds the second process, which counts the number of lines. So this simple pipe counts the number of lines in log that contain xyz.

This kind of pipe is called an "unnamed pipe".

There are also "named" pipes, or FIFOs. A named pipe is a file within the filesystem. A named pipe can be created by mkfifo (or mknod on older systems).

mkfifo my_fifo

ls -l my_fifo

prw-r--r--  1   dexin users 0 2009-03-04 22:04 my_fifo

A simple use of a named pipe: in two separate terminals, run

ls -l > my_fifo

cat < my_fifo

You will see the output from the first command displayed in the second terminal. The order in which you run the commands does not matter.

A more interesting use of a named pipe: you have a log file that's being updated periodically and you want to parse the new lines in the log. You can have one process run tail -F on the log and redirect its output to a pipe, and another process read from the pipe:

tail -F log > my_pipe

parser < my_pipe
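
To try the writer/reader pairing without opening two terminals, both ends can be run from one script by putting the writer in the background. This is a minimal sketch: the function name fifo_demo and the sample lines are made up, and grep -c stands in for the parser.

```shell
# Hypothetical demo: create a named pipe in a temporary directory, write
# three sample lines into it from a background process, and count the lines
# containing "xyz" from the reading end.
fifo_demo() {
    dir=$(mktemp -d) || return 1
    fifo="$dir/my_fifo"
    mkfifo "$fifo" || return 1
    # Writer: blocks until a reader opens the pipe, so it runs in the background.
    printf 'abc\nxyz 1\nxyz 2\n' > "$fifo" &
    # Reader: stands in for the parser and reads until the writer closes the pipe.
    grep -c xyz < "$fifo"
    wait
    rm -rf "$dir"
}
```

Calling fifo_demo prints the number of matching lines the reader saw, which shows that data written to one end of the pipe comes out of the other.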