Fine Tuning Java Code - String Addition
Summary
String types are one of the most used in software. However, as they are immutable objects, they may impair performance through inappropriate use in string manipulation.
String addition
Tip 4 (string concatenation)
Although the '+' operator makes string concatenation easier and quicker to write, and more readable, it is both time and memory intensive.
When string manipulation is required and performance is also an issue, consider using StringBuffer instead of String.
There is NO support for String manipulation at the JVM level. All the string semantics provided by the Java language are implemented by the compiler.
Consider what the compiler does in the following example:
String s1 = "Hello ";
String s2 = "world";
String s = s1 + s2;
Basically, the idea is to allocate a StringBuffer, keep its reference on the operand stack, and then use its append methods to store the strings in it. Finally, the whole string is retrieved using the toString method.
Java compilers can achieve this task in a number of ways. A straightforward one is:
BB 0004 // 6 : new java/lang/StringBuffer
59 // 9 : dup
B7 0005 // A : invokespecial java/lang/StringBuffer.<init>()V
2B // D : aload_1
B6 0006 // E : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
2C // 11 : aload_2
B6 0006 // 12 : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
B6 0007 // 15 : invokevirtual java/lang/StringBuffer.toString()
Ljava/lang/String;
4E // 18 : astore_3
A smarter compiler will generate a shorter sequence of bytecodes for the same result, by passing the first string directly to the StringBuffer constructor:
BB 000F // 6 : new java/lang/StringBuffer
59 // 9 : dup
2B // A : aload_1
B7 0012 // B : invokespecial java/lang/StringBuffer.<init>
(Ljava/lang/String;)V
2C // E : aload_2
B6 0016 // F : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
B6 001A // 12 : invokevirtual java/lang/StringBuffer.toString()
Ljava/lang/String;
4E // 15 : astore_3
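Expressed as Java source, the two bytecode listings above correspond roughly to the following. This is only a sketch for illustration, not compiler output: the class name ConcatDesugar and the method names are hypothetical, and only StringBuffer and its methods come from the JDK.

```java
// Hypothetical names; a hand-written equivalent of the two listings above.
class ConcatDesugar {
    // Mirrors the first listing: empty buffer, then one append per operand.
    static String viaEmptyBuffer(String s1, String s2) {
        return new StringBuffer().append(s1).append(s2).toString();
    }

    // Mirrors the second listing: the first operand seeds the buffer,
    // which saves one append call.
    static String viaSeededBuffer(String s1, String s2) {
        return new StringBuffer(s1).append(s2).toString();
    }

    public static void main(String[] args) {
        String s1 = "Hello ";
        String s2 = "world";
        System.out.println(viaEmptyBuffer(s1, s2));
        System.out.println(viaSeededBuffer(s1, s2));
        // Both forms should agree with the '+' operator.
        System.out.println((s1 + s2).equals(viaSeededBuffer(s1, s2)));
    }
}
```

Note that the seeded form assumes s1 is not null: the StringBuffer(String) constructor throws a NullPointerException for a null argument, whereas append(String) appends the text "null".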
What about using the '+=' operator for String concatenation?
The '+=' operator is available at the Java source language level. Because there is no support for it at the JVM level, the compiler once again provides the correct semantics using StringBuffer objects. As the resulting code shows, '+=' is particularly inefficient on String objects!
Indeed, each time '+=' is used, a new StringBuffer is allocated and the whole previous mechanism is repeated for both Strings (the one on the left and the one on the right).
String s = "";
s += "Hello ";
s += "world";
The associated generated code may look like the following:
BB 000D // 3 : new java/lang/StringBuffer
59 // 6 : dup
2B // 7 : aload_1
B7 0010 // 8 : invokespecial java/lang/StringBuffer.<init>
(Ljava/lang/String;)V
12 12 // B : ldc Hello
B6 0016 // D : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
B6 001A // 10 : invokevirtual java/lang/StringBuffer.toString()
Ljava/lang/String;
4C // 13 : astore_1
BB 000D // 14 : new java/lang/StringBuffer
59 // 17 : dup
2B // 18 : aload_1
B7 0010 // 19 : invokespecial java/lang/StringBuffer.<init>
(Ljava/lang/String;)V
12 1C // 1C : ldc world
B6 0016 // 1E : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
B6 001A // 21 : invokevirtual java/lang/StringBuffer.toString()
Ljava/lang/String;
4C // 24 : astore_1
BB 000D // 25 : new java/lang/StringBuffer
59 // 28 : dup
B7 001D // 29 : invokespecial java/lang/StringBuffer.<init>()V
4D // 2C : astore_2
2C // 2D : aload_2
12 1F // 2E : ldc Hello
B6 0016 // 30 : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
2C // 33 : aload_2
12 21 // 34 : ldc
B6 0016 // 36 : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
2C // 39 : aload_2
12 1C // 3A : ldc world
B6 0016 // 3C : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
2C // 3F : aload_2
B6 001A // 40 : invokevirtual java/lang/StringBuffer.toString()
Ljava/lang/String;
4E // 43 : astore_3
It is clear that the '+=' operator on String objects is both time and memory intensive: each time it is used, a new StringBuffer is created and its toString method is called.
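To make the cost visible at the source level, the sketch below expands '+=' by hand in the same way as the bytecode above and counts the buffers it allocates. It is an illustration under stated assumptions rather than what any particular compiler emits: the class PlusEqualsCost, its methods and the counter parameter are all hypothetical.

```java
// Hypothetical names; a hand-expanded version of repeated '+='.
class PlusEqualsCost {
    // What the programmer writes.
    static String plusEquals(String[] parts) {
        String s = "";
        for (int i = 0; i < parts.length; i++) {
            s += parts[i];
        }
        return s;
    }

    // Roughly what the compiler generates for the loop above.
    // buffersAllocated[0] is incremented once per temporary StringBuffer.
    static String expanded(String[] parts, int[] buffersAllocated) {
        String s = "";
        for (int i = 0; i < parts.length; i++) {
            StringBuffer tmp = new StringBuffer(s); // copies everything built so far
            buffersAllocated[0]++;
            tmp.append(parts[i]);
            s = tmp.toString();                     // yields a new String each time
        }
        return s;
    }

    public static void main(String[] args) {
        String[] parts = { "Hello ", "world" };
        int[] count = new int[1];
        System.out.println(expanded(parts, count));
        System.out.println("temporary buffers: " + count[0]);
        System.out.println(plusEquals(parts).equals(expanded(parts, count)));
    }
}
```

With n uses of '+=', this form allocates n temporary buffers and copies the accumulated text into each one, so the amount of copying grows with the length of the result at every step.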
The optimal code, followed by the bytecode generated for its append calls, is as follows:
StringBuffer stream = new StringBuffer();
....
stream.append("Hello");
stream.append(" ");
stream.append("world");
....
String s = stream.toString();
....
2C // 2D : aload_2
12 1F // 2E : ldc Hello
B6 0016 // 30 : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
2C // 33 : aload_2
12 21 // 34 : ldc
B6 0016 // 36 : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
2C // 39 : aload_2
12 1C // 3A : ldc world
B6 0016 // 3C : invokevirtual java/lang/StringBuffer.append
(Ljava/lang/String;)Ljava/lang/StringBuffer;
....
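The same pattern pays off most when the number of pieces is not known in advance, for example in a loop. The following is a minimal sketch, assuming a hypothetical helper class SingleBufferJoin with a join method; neither name comes from the JDK or from the listings above.

```java
// Hypothetical helper; one StringBuffer is reused for the whole loop.
class SingleBufferJoin {
    static String join(String[] words, String separator) {
        StringBuffer stream = new StringBuffer();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                stream.append(separator); // separator only between words
            }
            stream.append(words[i]);
        }
        return stream.toString();         // a single toString call at the end
    }

    public static void main(String[] args) {
        String[] words = { "Hello", "world" };
        System.out.println(join(words, " "));
    }
}
```

Only one buffer is allocated and toString is called once, however many words are appended. On Java 5 and later, StringBuilder offers the same append and toString methods without synchronization and can be substituted when the buffer is not shared between threads.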
