This is only loosely related to the code below, but... let's say I write code like this:

    String a = "Adelwin" + "Handoyo" + "programmer" + "java";

Is that slow or not? I mean relative to using a string buffer: which one is slower? The answer: they are equally slow, or equally fast. By the time this becomes a .class file the compiler has already optimized it. A concatenation of literals like this one is folded into a single constant at compile time, and concatenation that involves variables gets rewritten to use a StringBuilder.

What's my point? My point is that the JDK's optimizations go a long way and are fairly sophisticated. I once ran into a case where I split a string with String.split(), and when it was compiled it got replaced with a string tokenizer...

I'm not saying that it /will/ happen in all cases...

Adelwin Handoyo | Senior Consultant - Wholesale Bank
Standard Chartered Bank
7, Changi Business Park Cresent, Level 3. Singapore (486028)
T : (65) 659 61395 | E adelwin.adel...@sc.com

________________________________
From: jug-indonesia@yahoogroups.com [mailto:jug-indone...@yahoogroups.com] On Behalf Of Andrian Kurniady
Sent: Wednesday, May 19, 2010 3:42 PM
To: jug-indonesia@yahoogroups.com
Subject: Re: RE: [JUG-Indonesia] menggunakan StringTokenizer instead of split

You hit java heap space because all the tokens are stored in

    List<String[]> strings = new ArrayList<String[]>();

--> its contents end up as large as the file (minus the delimiters) plus the String object overhead.

I think the difference is that the tokenizer generates its strings on the fly. Each time you request a token it hands back one string, so not everything gets put into an array. When processing a long file, the memory needed is therefore not as large as the file, because each token can be processed serially and then discarded once it is no longer used.

Just increase the heap size, hehe...

-Kurniady

2010/5/19 ifnu <ifnub...@gmail.com>

I finally tried a bit of research too, using a microbenchmarking technique that isn't really valid. The test reads a CSV file and loads it into a two-dimensional matrix data structure.
Here's the code:

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.StringTokenizer;

    public class StringTest {

        public static void main(String[] args) throws IOException {
            // Generate a CSV file of 1000 lines with 1000 "x," cells each.
            BufferedWriter writer = new BufferedWriter(new FileWriter("c:/data.dat"));
            for (int i = 0; i < 1000; i++) {
                for (int x = 0; x < 1000; x++) {
                    writer.write("x,");
                }
                writer.write("\n");
                if (i % 100 == 0) {
                    writer.flush();
                }
            }
            writer.flush();
            writer.close();
            split();
            tokenizer();
        }

        public static void split() throws IOException {
            long start = System.currentTimeMillis();
            BufferedReader reader = new BufferedReader(new FileReader("c:/data.dat"));
            String line = null;
            // Every row is kept, so memory use grows with the size of the file.
            List<String[]> strings = new ArrayList<String[]>();
            while ((line = reader.readLine()) != null) {
                String[] split = line.split(",");
                strings.add(split);
            }
            reader.close();
            long end = System.currentTimeMillis();
            System.out.println("elapsed time " + (end - start));
        }

        public static void tokenizer() throws IOException {
            long start = System.currentTimeMillis();
            BufferedReader reader = new BufferedReader(new FileReader("c:/data.dat"));
            String line = null;
            List<List<String>> strings = new ArrayList<List<String>>();
            while ((line = reader.readLine()) != null) {
                // Pass "," explicitly. The code as first posted used
                // new StringTokenizer(line), which splits on whitespace only and
                // returns each line as a single token; the timings reported below
                // come from that version.
                StringTokenizer tokenizer = new StringTokenizer(line, ",");
                List<String> tokens = new ArrayList<String>();
                while (tokenizer.hasMoreTokens()) {
                    tokens.add(tokenizer.nextToken());
                }
                strings.add(tokens);
            }
            reader.close();
            long end = System.currentTimeMillis();
            System.out.println("elapsed time " + (end - start));
        }
    }

The results (in milliseconds):

    Split elapsed time 1375
    tokenizer elapsed time 187

The tokenizer is nearly 10x faster when the data is made larger. OK, OK, someone will say that split was called first and the tokenizer second, so what if we swap them?

    tokenizer elapsed time 125
    Split elapsed time 1500

It looks like my argument is still fairly valid based on this microbenchmark.
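Before reading much into timings like these, it can help to confirm that both approaches actually return the same tokens for the same line. Below is a minimal sketch of such a check, assuming a line in the same "x,x,...,x," shape as the generated file. The class and method names (TokenCountCheck, countDefault, countWithDelimiter, countSplit) are hypothetical and not part of the original post.

```java
import java.util.StringTokenizer;

// Hypothetical helper, not from the original post: counts the tokens that
// each approach produces for a single line.
final class TokenCountCheck {

    // StringTokenizer with its default delimiters (whitespace characters).
    static int countDefault(String line) {
        return new StringTokenizer(line).countTokens();
    }

    // StringTokenizer with an explicit delimiter set, e.g. ",".
    static int countWithDelimiter(String line, String delimiters) {
        return new StringTokenizer(line, delimiters).countTokens();
    }

    // String.split with a regular expression, e.g. ",".
    static int countSplit(String line, String regex) {
        return line.split(regex).length;
    }

    public static void main(String[] args) {
        // Build one line shaped like the benchmark data: 1000 times "x,".
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("x,");
        }
        String line = sb.toString();

        System.out.println("default tokenizer: " + countDefault(line));            // 1
        System.out.println("tokenizer with ',': " + countWithDelimiter(line, ",")); // 1000
        System.out.println("split on ',': " + countSplit(line, ","));              // 1000
    }
}
```

The two APIs also differ on empty fields: for "a,,b", split(",") returns three elements (the middle one empty), while a StringTokenizer with "," returns two tokens. That matters if the CSV can contain blank cells.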
Now a follow-up question: if the CSV data file is very large, say 1 GB, both functions hit java heap space, but that is more because I use BufferedReader. Is anyone up for the challenge of fixing the code so that both approaches can be tested against very large data? Anyone? ;)
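One possible way to take up that challenge, following the point made earlier in the thread that the heap fills up because every token is kept in a list: process each line as it is read and keep only a running count. This is a rough sketch, assuming the goal is to time tokenization itself rather than to build the full matrix. The class and method names (StreamingTokenCount, countWithTokenizer, countWithSplit) are hypothetical, and the methods take a Reader so they can be pointed at a FileReader for a real file or a StringReader for a quick check.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.StringTokenizer;

// Hypothetical sketch, not from the original post: tokenizes a CSV source
// line by line without retaining the tokens, so memory use stays bounded
// by the longest line rather than by the size of the file.
final class StreamingTokenCount {

    static long countWithTokenizer(Reader source) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                StringTokenizer tokenizer = new StringTokenizer(line, ",");
                while (tokenizer.hasMoreTokens()) {
                    tokenizer.nextToken(); // token is produced, then dropped
                    count++;
                }
            }
        }
        return count;
    }

    static long countWithSplit(Reader source) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // The array for this line becomes garbage after this statement.
                count += line.split(",").length;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String sample = "x,x,x,\nx,x,x,\n";
        System.out.println(countWithTokenizer(new StringReader(sample)));
        System.out.println(countWithSplit(new StringReader(sample)));
    }
}
```

To time the two approaches on a large file, each method could be called with new FileReader(path) and wrapped in the same System.currentTimeMillis() calls as the original test. Note that split still builds one array per line, so a single extremely long line would still need that much memory.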