Hi all,

I am new to the list and have been playing around with Riak over the last couple of months. I've taken on a client project to evaluate an alternative DB to their MS SQL Server for a specific issue they are having.

I am having some big problems with Riak... so any guidance, help, advice, or wisdom is much appreciated.

I have set up a 5-node AWS EC2 cluster of m3.2xlarge instances (8 vCPUs & 30 GB RAM each) behind an ELB.
I am using Riak v2.1.1, with openjdk-7-jdk as the Java VM.

I am initializing the cluster with 1.6 million XML records from a 15 GB file. I am using PHP XMLReader to stream the file and SimpleXML to parse the individual 'records' (each record has about 120 fields; I am only using 16 for the Riak DB). Going through the entire 15 GB file takes ~500 seconds for the parsing alone. (Is there a bulk insert tool available?) The parse loop looks roughly like the sketch below.
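For context, this is roughly what the parse loop does (the <record> element name and the field names here are placeholders, not the actual feed layout):

<?php
// Sketch of the streaming parse: XMLReader walks the 15 GB file record by record,
// SimpleXML handles field access on each small <record> subtree.
$reader = new XMLReader();
$reader->open('listings.xml');

// Skip ahead to the first <record> element.
while ($reader->read() && $reader->name !== 'record');

while ($reader->name === 'record') {
    $record = new SimpleXMLElement($reader->readOuterXML());

    // Keep only the 16 fields we actually store in Riak (names are placeholders).
    $fields = [
        'listing_id' => (string) $record->listing_id,
        'price'      => (float) $record->price,
        'location'   => (string) $record->location,
        // ... 13 more fields ...
    ];

    // Insert into Riak here (see the riak-php-client sketch further down).

    // Move to the next <record> sibling without loading the whole file into memory.
    $reader->next('record');
}

$reader->close();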

I am using riak-php-client to insert each record into a map data type. I have a custom Solr schema: 1 field with type="location_rpt", indexed="true", stored="true", and 15 fields of type float, int, or string with indexed="false" and stored="true". The insert path is sketched below.
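The insert path is roughly the following (the bucket type "activelistings" and bucket "listings" are the ones that show up in the error log further down; the host names are placeholders, and I am treating every field as a register for simplicity):

<?php
require 'vendor/autoload.php';

use Basho\Riak;
use Basho\Riak\Node;
use Basho\Riak\Command;

// Build the cluster connection (host names are placeholders).
$nodes = (new Node\Builder)
    ->onPort(8098)
    ->buildCluster(['riak1.example.com', 'riak2.example.com', 'riak3.example.com']);

$riak = new Riak($nodes);

function insertRecord(Riak $riak, $key, array $fields)
{
    // "listings" / "activelistings" match the bucket and bucket type in the
    // error log; the bucket type was created with datatype=map and associated
    // with the custom search index.
    $builder = (new Command\Builder\UpdateMap($riak))
        ->buildLocation($key, 'listings', 'activelistings');

    foreach ($fields as $name => $value) {
        // Every field is written as a register here for simplicity.
        $builder->updateRegister($name, (string) $value);
    }

    return $builder->build()->execute();
}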

*THE PROBLEM*: After 250K records, inserts slow to a crawl, going from 1 insert in 0.02 seconds to 1 insert in 1 second. Finally, Solr starts hitting java.lang.OutOfMemoryError: GC overhead limit exceeded errors!

Some riak.conf settings...
storage_backend = leveldb
search = on
leveldb.maximum_memory.percent = 90
ring_size = 128
search.solr.jvm_options = -d64 -Xms1g -Xmx22g -XX:+UseStringCache -XX:+UseCompressedOops

Did you note the **-Xmx22g** Heap Size?!?!

I've searched through Solr performance documents and Java VM garbage collection documents (should I try Java 8 with -XX:+UseG1GC?).
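If I do try that, I assume it would just be a matter of swapping the collector flag in riak.conf, something like:

search.solr.jvm_options = -d64 -Xms1g -Xmx22g -XX:+UseG1GC -XX:+UseCompressedOops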

So, prior to the java.lang.OutOfMemoryError: GC overhead limit exceeded, I start accruing the errors below...

2015-06-02 13:01:04.300 [error] <0.1482.0>@yz_kv:index:219 failed to index object {{<<"activelistings">>,<<"listings">>}, <<"aa164f5cc4c50c61bea377b4a82a26dc">>} with error {"Failed to index docs",{error,req_timedout}}
because [{yz_solr,index,3,[{file,"src/yz_solr.erl"},{line,199}]},
{yz_kv,index,7,[{file,"src/yz_kv.erl"},{line,269}]},
{yz_kv,index,3,[{file,"src/yz_kv.erl"},{line,206}]},
{riak_kv_vnode,actual_put,6,[{file,"src/riak_kv_vnode.erl"},{line,1582}]},
{riak_kv_vnode,perform_put,3,[{file,"src/riak_kv_vnode.erl"},{line,1570}]},
{riak_kv_vnode,do_put,7,[{file,"src/riak_kv_vnode.erl"},{line,1361}]},
{riak_kv_vnode,handle_command,3,[{file,"src/riak_kv_vnode.erl"},{line,543}]},
{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,345}]}]

And then, after a buildup of the above, I get the java.lang.OutOfMemoryError: GC overhead limit exceeded error:

2015-06-02 13:01:57.735 [info] <0.711.0>@yz_solr_proc:handle_info:135
solr stdout/err: at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.lucene.store.RAMFile.newBuffer(RAMFile.java:75)
    at org.apache.lucene.store.RAMFile.addBuffer(RAMFile.java:48)
    at org.apache.lucene.store.RAMOutputStream.switchCurrentBuffer(RAMOutputStream.java:152)
    at org.apache.lucene.store.RAMOutputStream.writeByte(RAMOutputStream.java:127)
    at org.apache.lucene.store.DataOutput.writeVInt(DataOutput.java:192)
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.encodeTerm(Lucene41PostingsWriter.java:572)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlock(BlockTreeTermsWriter.java:882)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:579)

Sincerely,


Robert Latko

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
