In PHP there are a few ways to extract a few characters from the middle of a string. One is to use substr():
string substr ( string $string, int $start [, int $length ] )
Another is by direct character reference:
$chars = $string[0].$string[1];
When only a single character is required, one would assume that direct character reference is quickest; when many characters are required, the overhead of a single function call should be offset by the cost of the many individual character lookups.
One of the projects I am working on at the moment requires unpacking fairly long strings, sometimes over 1024 bytes. Each byte dictates what the next few bytes are, and so every byte has to be analysed. I therefore set up a simple test case to find out the fastest way to do this.
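As a quick illustration that the two approaches return the same result, here is a minimal sketch. The input string and the variable names are made up for the example and do not come from the project described in this post.

```php
<?php
// Illustrative input; any string of at least four bytes works.
$string = "abcdef";

// substr(): start at zero-based offset 2 and take 2 bytes.
$viaSubstr = substr($string, 2, 2);

// Direct character reference: concatenate each byte by its offset.
$viaCharRef = $string[2] . $string[3];

// Both variables now hold "cd".
var_dump($viaSubstr === $viaCharRef);
```

Both offsets are zero-based, so offset 2 refers to the third character.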
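First, a simple timer function was needed. Here is the code I wrote a long time ago, and have used many times since.
// Called with no arguments: returns the current time in seconds.
// Called with a start time: returns the elapsed time in milliseconds.
// $btime is an optional overhead, in seconds, subtracted from the result.
function timer($stime=0,$btime=0){
    $time = explode(' ', microtime()); // microtime() returns "usec sec"
    $time = $time[1] + $time[0];
    if($stime){
        return round(($time - $stime - $btime)*1000,3);
    }
    return $time;
}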
If you call this function with no arguments, it returns the current time in seconds as a floating-point number. If you later pass that value back to the function, it returns the elapsed time in milliseconds, rounded to three decimal places.
A single call to each of the above methods would not suffice to compare them, so each method is run many times within a for loop. The loop itself has a fair amount of overhead, so the base time is first measured by running an empty for loop.
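For comparison, the same start/stop pattern can be sketched with microtime(true), which returns the time as a float directly. The name elapsed_ms() is hypothetical and is not the function used for the measurements in this post; the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper, not the timer() used for the measurements.
function elapsed_ms(float $start = 0.0): float
{
    $now = microtime(true); // seconds since the Unix epoch, as a float
    if ($start) {
        // Elapsed time in milliseconds, rounded to three decimal places.
        return round(($now - $start) * 1000, 3);
    }
    return $now;
}

$s = elapsed_ms();     // start: current time in seconds
usleep(2000);          // stand-in for the work being timed (about 2 ms)
$ms = elapsed_ms($s);  // stop: elapsed milliseconds
print "Elapsed: " . $ms . " milliseconds\n";
```

The printed value varies from run to run, which is one reason the tests in this post repeat each operation many times.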
$num = 1000000;
$s = timer();
for($i=0;$i<$num;$i++){}
$base = timer($s);
print "Loop test took ".$base." milliseconds\n";
When executed, we find that the for loop takes around 225 milliseconds to complete 1 million iterations.
Now for the actual tests (this one is for four characters, starting from the third character):
// Example input; the original test string is not shown, any string of six or more characters works.
$input = 'abcdefghijklmnopqrstuvwxyz';
$s = timer();
for($i=0;$i<$num;$i++){
    $var = substr($input,2,4);
}
print "substr() test took ".(timer($s)-$base)." milliseconds\n";
$s = timer();
for($i=0;$i<$num;$i++){
    $var = $input[2].$input[3].$input[4].$input[5];
}
print "char ref test took ".(timer($s)-$base)." milliseconds\n";
This outputs something like:
Loop test took 230.988 milliseconds
substr() test took 728.633 milliseconds
char ref test took 1263.388 milliseconds
So we can see that for extracting four characters, using substr() is faster. The same is not true for one and two characters, as the following results show (averages of four runs):
Number of chars    substr() time/ms    direct char time/ms
1                  713                 358
2                  770                 630
3                  753                 945
4                  751                 1280
5                  733                 1557
It can be seen that on my development system, direct character reference is faster for two or fewer characters, and substr() is faster for three or more characters.
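To show how this result might be applied to the unpacking problem mentioned earlier, here is a minimal sketch. The record format (one length byte followed by that many payload bytes) and the function name unpack_records() are assumptions made for illustration; the actual format used in the project is not described here.

```php
<?php
// Hypothetical format: one length byte, followed by that many payload bytes.
function unpack_records(string $data): array
{
    $records = [];
    $pos = 0;
    $end = strlen($data);
    while ($pos < $end) {
        // A single byte: direct character reference.
        $len = ord($data[$pos]);
        $pos++;
        if ($pos + $len > $end) {
            break; // truncated record: stop parsing
        }
        // A run of several bytes: substr().
        $records[] = substr($data, $pos, $len);
        $pos += $len;
    }
    return $records;
}

$packed = "\x03abc\x01z\x05hello";
print_r(unpack_records($packed)); // three records: "abc", "z", "hello"
```

The single-byte lookup uses direct character reference and the variable-length payload uses substr(), in line with the timings above. The crossover point may differ on other systems and PHP versions.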