Riak Java client bug?

2013-09-25 Thread Guido Medina
Hi, streaming of 2i indexes is not timing out, even though the client is configured to time out. This is, coincidentally, causing the writes to fail (or is it the opposite?). Is there anything elemental that could lock? (I know the locking concept in Erlang is out of the equation, so LevelDB?)

Re: Deleting data from LevelDB backend

2013-09-25 Thread Simon Effenberg
On Wed, 25 Sep 2013 09:15:33 -0400 Matthew Von-Maszewski matth...@basho.com wrote: - run Riak repair on the vnode so that leveldb can create a MANIFEST that matches the files remaining. What do you mean by this? Wait for AAE? Request each key to trigger read repair, or am I missing something?

Work Shedding/Overload protection settings

2013-09-25 Thread Raghu Katti
Hello, I was wondering if there is any formula to determine these settings for overload protection and load shedding. The defaults are 5 and 1. I wanted to find out whether there are any recommended ways to determine custom values for these numbers for a given cluster configuration.

2i at large scale?

2013-09-25 Thread Wagner Camarao
Hi, I'm benchmarking 2i at a scale of a billion records, running one physical node locally with mostly default configs - except for LevelDB instead of Bitcask. Up to this point (14MM records in the bucket that's being indexed) it's still performing lookups well for my use case (read ~ 7ms using

Re: Riak Java client bug?

2013-09-25 Thread Brian Roach
Guido - When you say the client is configured to time out, do you mean you're using PB and you set the SO_TIMEOUT on the socket via the PBClientConfig's withRequestTimeoutMillis()? - Roach On Wed, Sep 25, 2013 at 5:54 AM, Guido Medina guido.med...@temetra.com wrote: Hi, Streaming 2i indexes

Re: 2i at large scale?

2013-09-25 Thread Vincenzo Vitale
Basho people will know if this is normal or not, but keep in mind that this way you are storing three copies of the data on the same machine, which also hosts all 64 default vnodes of your ring. I think Riak is designed for a cluster setup. Are you planning to run the same benchmark with

Re: Riak Java client bug?

2013-09-25 Thread Guido Medina
Like this: withConnectionTimeoutMillis(5000).build(); Guido. On 25/09/13 18:08, Brian Roach wrote: Guido - When you say the client is configured to time out do you mean you're using PB and you set the SO_TIMEOUT on the socket via the PBClientConfig's withRequestTimeoutMillis()? - Roach
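
A minimal sketch of the distinction being discussed, assuming the 1.x Java client's PBClientConfig builder; the import paths and the withHost/withPort setters are from memory and worth verifying, while the two timeout setters are the ones named in this thread. The name withConnectionTimeoutMillis() suggests it only bounds establishing the TCP connection; per Roach, withRequestTimeoutMillis() is what sets SO_TIMEOUT on the socket, so a stalled request such as a 2i stream can actually time out.

    import com.basho.riak.client.IRiakClient;
    import com.basho.riak.client.RiakFactory;
    import com.basho.riak.client.raw.pbc.PBClientConfig;

    public class TimeoutConfigSketch {
        public static void main(String[] args) throws Exception {
            PBClientConfig config = new PBClientConfig.Builder()
                    .withHost("127.0.0.1")             // placeholder host
                    .withPort(8087)                    // default PB port
                    .withConnectionTimeoutMillis(5000) // bounds only the TCP connect
                    .withRequestTimeoutMillis(5000)    // SO_TIMEOUT: lets a stalled request/stream time out
                    .build();

            IRiakClient client = RiakFactory.newClient(config);
            // ... use the client (e.g. streaming 2i queries), then:
            client.shutdown();
        }
    }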

Re: Riak Java client bug?

2013-09-25 Thread Guido Medina
To ease the situation we have done some redesign to avoid lock contention (internal to our app), because we were writing too often in a very short time to the same key (no siblings). So it might be a combination of LevelDB + AAE + 2i streaming. Tomorrow hopefully things are back to normal, the

Re: 2i at large scale?

2013-09-25 Thread Reid Draper
Wagner, Are you using paginated 2i? Or asking for all of the results buffered at once? If the latter, I'd highly recommend trying the new paginated 2i that landed in Riak 1.4.0. Reid On Sep 25, 2013, at 10:50 AM, Wagner Camarao wag...@crunchbase.com wrote: Hi, I'm benchmarking 2i at scale
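
For anyone following the thread, here is a rough sketch of what paginated 2i looks like over Riak 1.4's HTTP interface, using only the standard JDK. The host, bucket, index, and value in the URL are placeholders, max_results and continuation are the 1.4 pagination query parameters, and the extractContinuation helper is a crude hypothetical stand-in for a real JSON parser.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class Paginated2iSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder host/bucket/index/value; each response holds at most 1000 keys.
            String base = "http://127.0.0.1:8098/buckets/mybucket/index/myfield_bin/somevalue";
            String continuation = null;

            do {
                // URL-encode the continuation token in real code.
                String url = base + "?max_results=1000"
                        + (continuation == null ? "" : "&continuation=" + continuation);
                HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();

                StringBuilder body = new StringBuilder();
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        body.append(line);
                    }
                }

                // The JSON body carries a "keys" array plus, when more results remain,
                // an opaque "continuation" token to send with the next request.
                System.out.println(body);
                continuation = extractContinuation(body.toString());
            } while (continuation != null);
        }

        // Hypothetical helper: pull the "continuation" field out of the JSON response.
        private static String extractContinuation(String json) {
            int i = json.indexOf("\"continuation\"");
            if (i < 0) return null;
            int start = json.indexOf('"', json.indexOf(':', i) + 1) + 1;
            return json.substring(start, json.indexOf('"', start));
        }
    }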

Re: Deleting data from LevelDB backend

2013-09-25 Thread Timo Gatsonides
Matthew, thanks very much for your feedback. I may bite the bullet and try your suggested last resort method on one of the nodes. I'll let you know how that goes. All nodes are 1.3 or newer. As to your question: my key is a bit of both and is quite long for typical Riak usage I guess. The

Re: Deleting data from LevelDB backend

2013-09-25 Thread Timo Gatsonides
Excellent news. Thanks very much! -Timo On 25 sep. 2013, at 21:44, Matthew Von-Maszewski wrote: Timo, Your email again brought up the topic of fixing this issue within Basho's leveldb. Previously there had always been a bigger problem and we did not worry about tombstone (delete)

Re: 2i at large scale?

2013-09-25 Thread Reid Draper
On Sep 25, 2013, at 3:39 PM, Wagner Camarao wag...@crunchbase.com wrote: Reid, Thanks for bringing that up and yes, I'm using paginated 2i. * What size page are you using (i.e. how many results per query)? * When you said the results would sometimes take three minutes, is that per page, or to

Re: 2i at large scale?

2013-09-25 Thread Wagner Camarao
What size page are you using (i.e. how many results per query)? A: 10 When you said the results would sometimes take three minutes, is that per page, or to paginate through all of the results? A: I don't observe any difference between paging the first page or any subsequent page. That wait time is only when I

allow_mult vs. 2i

2013-09-25 Thread Brady Wetherington
I've built a solid proof-of-concept system on LevelDB, and use some 2i indexes to search for certain things - usually just for counts of things. I have two questions so far: First off, why is Bitcask the default? Is it just because it is faster? Or is it considered more 'stable' or

Re: allow_mult vs. 2i

2013-09-25 Thread Jeremiah Peschka
inline. --- Jeremiah Peschka - Founder, Brent Ozar Unlimited MCITP: SQL Server 2008, MVP Cloudera Certified Developer for Apache Hadoop On Wed, Sep 25, 2013 at 2:47 PM, Brady Wetherington br...@bespincorp.com wrote: I've built a solid proof-of-concept system on LevelDB, and use some 2i

Re: 2i at large scale?

2013-09-25 Thread Reid Draper
On Sep 25, 2013, at 4:35 PM, Wagner Camarao wag...@crunchbase.com wrote: What size page are you using (i.e. how many results per query)? A: 10 If I understood correctly, you're paginating through a few million results total. If so, I'd try setting your page size much larger: try 1000.
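
To make the page-size suggestion concrete, a back-of-the-envelope count of index queries, assuming roughly three million matching results (the "few million" estimate above):

    public class PageSizeMath {
        public static void main(String[] args) {
            long totalResults = 3_000_000L; // assumed ballpark, per the thread
            System.out.println(totalResults / 10);   // page size 10   -> 300,000 index queries
            System.out.println(totalResults / 1000); // page size 1000 ->   3,000 index queries
        }
    }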

Invoking Python script as part of a M/R job?

2013-09-25 Thread Jeffrey Eliasen
I would like to write a job that invokes a Python script to execute the processing of each node in a bucket. I can't find a way to do this using JavaScript and I don't really know Erlang well enough to make this work... is there a sample piece of code somewhere that demonstrates this? Thanks in
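
Riak's built-in MapReduce phases only run Erlang or JavaScript functions, so there is no direct way to call a Python script from a phase. One workaround (not from this thread, just a sketch under assumptions) is to drive the processing from the client instead: iterate the bucket's keys, fetch each object, and pipe its value to the external script. The client calls below are from memory of the 1.x Java API and the process_image.py script is hypothetical; note that listing every key in a large bucket is expensive.

    import com.basho.riak.client.IRiakClient;
    import com.basho.riak.client.IRiakObject;
    import com.basho.riak.client.RiakFactory;
    import com.basho.riak.client.bucket.Bucket;

    public class ExternalScriptSketch {
        public static void main(String[] args) throws Exception {
            IRiakClient client = RiakFactory.pbcClient(); // local node, default PB port
            Bucket bucket = client.fetchBucket("images").execute(); // placeholder bucket name

            // Listing keys walks the whole bucket -- fine for a batch job, costly otherwise.
            for (String key : bucket.keys()) {
                IRiakObject obj = bucket.fetch(key).execute();
                if (obj == null) continue; // key may have been deleted meanwhile

                // Hand the object's bytes to a hypothetical Python script over stdin.
                Process p = new ProcessBuilder("python", "process_image.py", key)
                        .redirectErrorStream(true)
                        .start();
                p.getOutputStream().write(obj.getValue());
                p.getOutputStream().close();
                p.waitFor();
            }
            client.shutdown();
        }
    }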

OSX Riak 1.4 Internal Server Error on all map reduce jobs

2013-09-25 Thread jeffrey k eliasen
I'm trying to run a map/reduce query and running into strange errors. I can't figure out how to determine the cause of the error I'm receiving. I'm basically getting this same error with any map/reduce query to any existing bucket. I found that the simplest query that demonstrates the problem

Riak listens on strange port

2013-09-25 Thread Jorge Sanchez
Hi, I can't find the following port in the configuration: tcp 0 0 0.0.0.0:50588 0.0.0.0:* LISTEN 9337/beam.smp root@gruppuco:/usr/local/riak/etc# pwd /usr/local/riak/etc root@gruppuco:/usr/local/riak/etc# grep 50588 * Does anybody know what is this

Re: Riak listens on strange port

2013-09-25 Thread Jon Meredith
It's probably the port used for distributed Erlang. epmd -names will confirm. jmeredith$ epmd -names epmd: up and running on port 4369 with data: name myerlnode at port 55607 More information on how to configure it is here: http://docs.basho.com/riak/latest/ops/advanced/security/ Jon On Wed, Sep

Re: Invoking Python script as part of a M/R job?

2013-09-25 Thread jeffrey k eliasen
I'm trying to do some image processing using OpenCV. Later I'll be doing some video processing as well. In a future project I will be using R to do deep analysis on some data I'm collecting. In all these cases, what I want to do is very simple with external languages but very hard with both