Re: Write to table from Accumulo iterator

2014-04-25 Thread BlackJack76
As long as my tablets stay constant, I have no problem using a BatchWriter in an iterator. -- View this message in context: http://apache-accumulo.1065345.n5.nabble.com/Write-to-table-from-Accumulo-iterator-tp9412p9422.html Sent from the Users mailing list archive at Nabble.com.

Re: Write to table from Accumulo iterator

2014-04-25 Thread BlackJack76
I will have to take a look at Keith's project. Do you have any experience with the scanner or writer deadlocking in an iterator? Or did you hear that somewhere? Just curious because for the most part it has worked well. Only had one problem recently and trying to find out if there is a bett

Re: Write to table from Accumulo iterator

2014-04-25 Thread Josh Elser
I don't believe heavy load is a requirement. I'm pretty sure you can deadlock pretty easily if you try writing within an iterator. Focus on Accismus would be best IMO, but, like Bill said, it's probably not fully there. On 4/25/14, 11:42 PM, William Slacum wrote: Our own Keith Turner is tryi

Re: Write to table from Accumulo iterator

2014-04-25 Thread BlackJack76
I am trying to have them run in parallel instead of running them serially.

Re: Write to table from Accumulo iterator

2014-04-25 Thread BlackJack76
Yes sir

Re: Write to table from Accumulo iterator

2014-04-25 Thread William Slacum
Our own Keith Turner is trying to make this possible with Accismus ( https://github.com/keith-turner/Accismus). I don't know the current state of it, but I believe it's still in the early stages. I've always been under the impression that launching a scanner or writer from within an iterator, as i

Re: Write to table from Accumulo iterator

2014-04-25 Thread David Medinets
Can you change the ingest process to tokenize on ingest? On Fri, Apr 25, 2014 at 10:45 PM, BlackJack76 wrote: > Sure thing. Basically, I am attempting to index a document. When I find > the > document, I want to insert the tokens directly back into the table. I want > to do it directly from the
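A minimal sketch of what tokenizing at ingest time could look like, written as a plain Python model rather than against the Accumulo client API. The function name `index_entries`, the `idx:` row prefix, and the `doc` column family are all hypothetical choices for illustration; the thread does not specify a table layout.

```python
import re

def index_entries(doc_id, text):
    """Yield (row, colfam, colqual, value) tuples for one document.

    Hypothetical layout: the document lives under its own row, and each
    distinct token gets an index row that points back at the document.
    """
    # The document itself.
    yield (doc_id, "doc", "text", text)
    # One index entry per distinct lowercase token, in sorted order.
    for token in sorted(set(re.findall(r"[a-z0-9]+", text.lower()))):
        yield ("idx:" + token, "doc", doc_id, "")

entries = list(index_entries("doc1", "Write to table, write fast"))
```

Because the index entries are produced by the client next to the document, they can go through an ordinary writer at ingest time, and no write needs to originate inside a server-side iterator.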

Re: Write to table from Accumulo iterator

2014-04-25 Thread Russ Weeks
Like, building the index lazily? Very interesting idea... -Russ On Friday, April 25, 2014, BlackJack76 wrote: > Sure thing. Basically, I am attempting to index a document. When I find > the > document, I want to insert the tokens directly back into the table. I want > to do it directly from t

Re: Write to table from Accumulo iterator

2014-04-25 Thread BlackJack76
Sure thing. Basically, I am attempting to index a document. When I find the document, I want to insert the tokens directly back into the table. I want to do it directly from the seek routine so that I don't need to return anything back to the client. For example, seek may locate the document th

Re: Write to table from Accumulo iterator

2014-04-25 Thread Mike Drob
Can you share a little more about what you are trying to achieve? My first thought would be to try looking at the Conditional Mutations present in 1.6.0 (not yet released) as either a ready implementation or a starting point for your own code. On Apr 25, 2014 10:13 PM, "BlackJack76" wrote: > I a

Write to table from Accumulo iterator

2014-04-25 Thread BlackJack76
I am trying to figure out the best way to write to the table from inside the seek method of a class that implements SortedKeyValueIterator. I originally tried to create a BatchWriter and just use that to write data. However, if the tablet moved during a flush then it would hang. Any other recomm

Re: Embedded Mutations: Is this kind of thing done?

2014-04-25 Thread Geoffry Roberts
I think you told me something. I must watch the rowid colfam colq sequence and be sure they are unique within the row. Will do. I believe I do have distinct datatypes for now (they're medical) but the future may rear its ugly head. On Fri, Apr 25, 2014 at 11:02 AM, Josh Elser wrote: > I migh

Re: Embedded Mutations: Is this kind of thing done?

2014-04-25 Thread Josh Elser
I might be causing more confusion. Consider the following: {"name":"Josh", "age":85} If you stored the attribute name in the colf and the type (string or int) in the colq, it works fine for the above document. Now consider the following document, say where there were multiple sources of my a

Re: Embedded Mutations: Is this kind of thing done?

2014-04-25 Thread Geoffry Roberts
Ok Josh, you have me worried. I am storing the object's name in the colfam: e.g. "patientId", the object's data type goes in the colq: e.g. "org.hl7.v3.II", then the value in the colval. I think the largest graph I'm likely to have is < 5k and you say I could have memory problems. This is good top

Re: Embedded Mutations: Is this kind of thing done?

2014-04-25 Thread Josh Elser
Not necessarily. If you are storing just the type in the colq and have one value and type per document/row, you won't have a problem. If you have more than one value in a type per document/row, the last one you inserted will be what sticks (which is likely undesirable). Of course, this is also
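The collision described here can be modelled with a plain dictionary keyed on (row, colfam, colqual). This is a simplified Python sketch, not the Accumulo API: it assumes only the newest value per key is retained, as the message describes, and the `apply_puts` helper, the sample values, and the numeric suffix on the qualifier are hypothetical.

```python
def apply_puts(puts):
    """Model a table that keeps one value per (row, colfam, colqual) key."""
    table = {}
    for row, colfam, colqual, value in puts:
        # A later put with an identical key replaces the earlier value.
        table[(row, colfam, colqual)] = value
    return table

# Only the type in the qualifier: two values of the same type collide.
collide = apply_puts([
    ("doc1", "name", "string", "value-a"),
    ("doc1", "name", "string", "value-b"),
])

# Hypothetical workaround: add a per-value discriminator to the qualifier.
distinct = apply_puts([
    ("doc1", "name", "string/0", "value-a"),
    ("doc1", "name", "string/1", "value-b"),
])
```

In the first case only `value-b` survives; in the second both values are kept because their keys differ.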

Re: Embedded Mutations: Is this kind of thing done?

2014-04-25 Thread Geoffry Roberts
Thanks for the comments. I'm using the qualifier to tell me the type of the value. Sounds like I'm misusing it. My EMF documents are running no more than 5k so I gather a row will fit into memory well enough. On Fri, Apr 25, 2014 at 9:29 AM, Mike Drob wrote: > Large rows are only an issue i

Re: Embedded Mutations: Is this kind of thing done?

2014-04-25 Thread Mike Drob
Large rows are only an issue if you are going to try to put the entire row in memory at once. As long as you have small enough entries in the row, and can treat them individually, you should be fine. The qualifier is anything that you want to use to determine uniqueness across keys. So yes, this s

Re: Embedded Mutations: Is this kind of thing done?

2014-04-25 Thread Eric Newton
I don't have detailed knowledge of your key, but generally speaking: A row can have billions of columns. There is no assumption in Accumulo that the row will fit in memory. Of course, a single mutation will need to fit in memory. A row will always be served from just a single server, so its imp
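Since a single mutation must fit in memory while a row need not, one way to write a very wide row is to spread its columns across several mutations that share the same row id. The sketch below is a plain Python model of that batching, not the Accumulo API; `split_into_mutations` and `max_size` are hypothetical names, and the size estimate (character counts) is only a rough stand-in for the real serialized size.

```python
def split_into_mutations(row, columns, max_size):
    """Group (colfam, colqual, value) columns for one row into bounded batches."""
    batches, current, size = [], [], 0
    for colfam, colqual, value in columns:
        # Rough size estimate: character counts of the three parts.
        cost = len(colfam) + len(colqual) + len(value)
        # Start a new batch when adding this column would exceed the limit.
        if current and size + cost > max_size:
            batches.append((row, current))
            current, size = [], 0
        current.append((colfam, colqual, value))
        size += cost
    if current:
        batches.append((row, current))
    return batches

columns = [("f", "q%d" % i, "x" * 10) for i in range(5)]
batches = split_into_mutations("row1", columns, max_size=30)
```

Each batch stands in for one mutation on the same row id, so the client never has to hold the whole row at once.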

Re: Embedded Mutations: Is this kind of thing done?

2014-04-25 Thread Geoffry Roberts
Interesting, multiple mutations that is. Are we talking multiples on the same row id? Upon reflection, I realized the embedded thing is nothing special. I think I'll keep adding columns to a single mutation. This will make for a wide row, but I'm not seeing that as a problem. Am I being naiv