Hi all,
I am new to the list and have been playing around with Riak over the
last couple of months. I've taken on a client project to evaluate an
alternative DB to their MS SQL Server for a specific issue they are having.
I am having some big problems with Riak... so any guidance, help,
advice, wisdom is much appreciated.
I have set up a 5-node AWS EC2 m3.2xlarge (8 vCPU & 30 GB RAM) cluster
behind an ELB.
I am running Riak v2.1.1 with openjdk-7-jdk as the JVM.
I am initializing the cluster with 1.6 million XML records in a 15 GB
file. I am using PHP's XMLReader to stream the file and SimpleXML to parse
individual 'records' (each record has about 120 fields; I am only using
16 for the Riak DB). Parsing alone gets through the entire 15 GB file in
~500 seconds. (Is there a bulk insert tool available?)
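For reference, the streaming parse is roughly this shape — element and
field names ("record", "price", "location") are hypothetical stand-ins,
and the tiny inline file stands in for the real 15 GB one:

```php
<?php
// Minimal sketch of the XMLReader + SimpleXML streaming approach.
// A tiny stand-in for the 15 GB file:
file_put_contents('listings.xml',
    '<listings><record><price>1.5</price><location>NY</location></record>' .
    '<record><price>2.0</price><location>LA</location></record></listings>');

$reader = new XMLReader();
$reader->open('listings.xml');

// Advance to the first <record> without loading the whole document.
while ($reader->read() && $reader->name !== 'record');

$parsed = [];
while ($reader->name === 'record') {
    // Hand one record's subtree to SimpleXML for convenient field access.
    $record = new SimpleXMLElement($reader->readOuterXML());
    $parsed[] = [
        'price'    => (float) $record->price,
        'location' => (string) $record->location,
        // ...in the real job, only 16 of the ~120 fields are kept...
    ];
    $reader->next('record'); // skip straight to the next sibling record
}
$reader->close();
echo count($parsed), " records\n"; // prints "2 records"
```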
I am using riak-php-client to insert records into a map datatype. I have
a custom schema: 1 field with type="location_rpt" indexed="true"
stored="true", and 15 fields of type float, int, or string with
indexed="false" stored="true".
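Each insert goes through the client's map builder, roughly like this —
sketched from memory of the riak-php-client 2.x API, so method names may
differ slightly; the key, coordinates, and values are hypothetical, while
the bucket type ("activelistings") and bucket ("listings") match the
error log further down:

```php
<?php
// One map insert via the riak-php-client builder API (sketch).
require 'vendor/autoload.php';

use Basho\Riak;
use Basho\Riak\Node;
use Basho\Riak\Command;

// In production this points at the ELB rather than localhost.
$node = (new Node\Builder)->atHost('127.0.0.1')->onPort(8098)->build();
$riak = new Riak([$node]);

$response = (new Command\Builder\UpdateMap($riak))
    ->updateRegister('location', '43.05,-89.45')   // the location_rpt field
    ->updateRegister('price', '250000')            // one of the 15 others
    ->buildLocation('aa164f5cc4c50c61bea377b4a82a26dc',
                    'listings', 'activelistings')
    ->build()
    ->execute();

echo $response->isSuccess() ? "stored\n" : "failed\n";
```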
*THE PROBLEM*: After 250K records, inserts slow to a crawl, going
from 1 insert in 0.02 seconds to 1 insert in 1 second. Finally, it
starts hitting java.lang.OutOfMemoryError: GC overhead limit exceeded
errors!
Some riak.conf settings...
storage_backend = leveldb
search = on
leveldb.maximum_memory.percent = 90
ring_size = 128
search.solr.jvm_options = -d64 -Xms1g -Xmx22g -XX:+UseStringCache
-XX:+UseCompressedOops
Did you note the **-Xmx22g** Heap Size?!?!
I've searched Solr performance documents and Java VM garbage collection
documents (should I try Java 8 with -XX:+UseG1GC?).
So prior to the java.lang.OutOfMemoryError: GC overhead limit exceeded,
I start accumulating the errors below...
2015-06-02 13:01:04.300 [error]
<0.1482.0>@yz_kv:index:219 failed to index object
{{<<"activelistings">>,<<"listings">>},
<<"aa164f5cc4c50c61bea377b4a82a26dc">>} with error {"Failed to index
docs",{error,req_timedout}}
because [{yz_solr,index,3,[{file,"src/yz_solr.erl"},{line,199}]},
{yz_kv,index,7,[{file,"src/yz_kv.erl"},{line,269}]},
{yz_kv,index,3,[{file,"src/yz_kv.erl"},{line,206}]},
{riak_kv_vnode,actual_put,6,[{file,"src/riak_kv_vnode.erl"},{line,1582}]},
{riak_kv_vnode,perform_put,3,[{file,"src/riak_kv_vnode.erl"},{line,1570}]},
{riak_kv_vnode,do_put,7,[{file,"src/riak_kv_vnode.erl"},{line,1361}]},
{riak_kv_vnode,handle_command,3,[{file,"src/riak_kv_vnode.erl"},{line,543}]},
{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,345}]}]
And then, after a buildup of the above, I get the
java.lang.OutOfMemoryError: GC overhead limit exceeded error:
2015-06-02 13:01:57.735 [info] <0.711.0>@yz_solr_proc:handle_info:135
solr stdout/err: at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.lucene.store.RAMFile.newBuffer(RAMFile.java:75)
at org.apache.lucene.store.RAMFile.addBuffer(RAMFile.java:48)
at
org.apache.lucene.store.RAMOutputStream.switchCurrentBuffer(RAMOutputStream.java:152)
at
org.apache.lucene.store.RAMOutputStream.writeByte(RAMOutputStream.java:127)
at org.apache.lucene.store.DataOutput.writeVInt(DataOutput.java:192)
at
org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.encodeTerm(Lucene41PostingsWriter.java:572)
at
org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlock(BlockTreeTermsWriter.java:882)
at
org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:579)
Sincerely,
Robert Latko
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com