Re: Nutch + HBase

2008-06-18 Thread Isabel Drost
On Wednesday 18 June 2008, Dennis Kubes wrote: > was to > make the Nutch architecture itself more flexible. This is what I have > been terming Nutch 2 and what I have been currently working on. Sounds interesting. Can you point me to any further information or discussion threads on Nutch 2? Isab

hbase-0.1.3 release candidate 3 [WAS -> Re: hbase-0.1.3 release candidate 2]

2008-06-18 Thread stack
Michael Bieniosek reports I'd mangled release candidate 2. Here is release candidate 3: http://people.apache.org/~stack/hbase-0.1.3-candidate-3/ Please take it for a spin. Voting (again) closes Friday, June 20th. This release only runs on hadoop 0.16.x. Yours, HBase Team stack wrote: Here

Re: Question about hlog recovery.

2008-06-18 Thread stack
Clint Morgan wrote: I have a local cluster running, and it's logging to /log_X.X.X.X_1213228101021_60020/ Then I kill both master and regionserver, and restart. Looking through the logs I don't see anything about trying to recover from this hlog; it just creates a new hlog alongside the existing

hbase-0.1.3 release candidate 2

2008-06-18 Thread stack
Here's a new 0.1.3 release candidate: http://people.apache.org/~stack/hbase-0.1.3-candidate-2/ Please try it out. Voting closes Friday, June 20th. (Candidate 1 was squashed because a couple more fixes came in: HBASE-680, HBASE-684, and HBASE-686.) Thanks, The HBase Team

Re: Nutch + HBase

2008-06-18 Thread Dennis Kubes
We have discussed it but not implemented it. A previous step before implementing interfaces to use HBase for current Nutch databases was to make the Nutch architecture itself more flexible. This is what I have been terming Nutch 2 and what I have been currently working on. Dennis Marcus Hero

Re: Problem with scanner again

2008-06-18 Thread stack
I ain't sure how it'd work -- a callback that regionserver-side scanner can pull on to send the state-laden filter back to the client for passing to next region? -- but I like this idea. I added hbase-695. St.Ack Clint Morgan wrote: Ah, good point. I missed that. I guess to make this work we

Re: Problem with scanner again

2008-06-18 Thread Clint Morgan
Ah, good point. I missed that. I guess to make this work we need to add an RPC method on regionserver to get the current filter, grab it clientside when done with a region, and pass it back to the next region to be scanned (rather than a fresh filter as currently). 2008/6/18 stack <[EMAIL PROTECT

Re: regions reassignment

2008-06-18 Thread stack
Enable DEBUG. Then look at master and regionserver log. My guess is that there is something up on the regionserver that is preventing successful deploy of the region assigned by the master. St.Ack Krzysztof Szlapinski wrote: hi all, I'm just in the middle of a process of creation of more th

Re: Problem with scanner again

2008-06-18 Thread stack
I was going to suggest filters but was wondering what happens when you cross regions or cross regionservers? What happens if regionserver A only has half of the fifth page and you need to go to another server to get the rest? Its filters will know nought of regionserver A's state? Thanks, St

Re: Problem with scanner again

2008-06-18 Thread Clint Morgan
Why the aversion to filters? That's how we solve this problem: I have a simple SkipRowFilter that I wrote that does exactly this... -clint 2008/6/18 Krzysztof Gałęcki <[EMAIL PROTECTED]>: > I can't cache items on List because of two main reasons: > > 1. I have too many items to cache them in memory

RE: Problem with scanner again

2008-06-18 Thread Krzysztof Gałęcki
I can't cache items on List because of two main reasons: 1. I have too many items to cache them in memory 2. I would also like to use non GUID-based rowIds. In that situation we are talking about inserting items, not just adding them to the end of table (list). I think there should be some method

Re: Problem with scanner again

2008-06-18 Thread stack
Krzych: What J-D says -- an application-side cache associated with user session -- or stash startrows into user session? For example, on first fetch, cut up the first hundred or so results into pages and save the page startrows in user session. When user comes back and asks for a 3rd page, s
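The "stash page startrows in user session" idea in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread and not the HBase client API: a sorted Python list of row keys stands in for the table, a list slice stands in for a scanner opened at a start row, and the names build_page_start_rows and fetch_page are hypothetical.

```python
from bisect import bisect_left


def build_page_start_rows(first_row_keys, page_size):
    """Cut the first batch of scanned row keys into pages.

    Returns the start row of each page; this small list is what would be
    saved in the user session (assumption: keys arrive in scan order).
    """
    return first_row_keys[::page_size]


def fetch_page(sorted_table, page_start_rows, page_number, page_size):
    """Return the rows of a 1-based page by "opening a scanner" at its start row.

    sorted_table is a stand-in for the real table; a real client would open
    a scanner at start_row and read page_size rows instead of slicing.
    """
    start_row = page_start_rows[page_number - 1]
    first = bisect_left(sorted_table, start_row)
    return sorted_table[first:first + page_size]


# Hypothetical data: 35 rows, 10 rows per page.
rows = ["row%03d" % i for i in range(35)]
page_starts = build_page_start_rows(rows[:100], 10)  # first hundred or so results
third_page = fetch_page(rows, page_starts, 3, 10)
```

Only the start rows are kept per session, so memory use stays small even when the table is large; rows inserted later between two saved start rows would make a page longer than expected, which this sketch does not handle.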

regions reassignment

2008-06-18 Thread Krzysztof Szlapinski
hi all, I'm in the middle of creating more than 30 empty tables in a row on a cluster of 3 machines. I've just run a batch script in the jruby shell and I'm observing what happens on the different Region Servers. I can see that HMaster during this process of creation keeps re

Re: Problem with scanner again

2008-06-18 Thread Jean-Daniel Cryans
Hi krzych, If you really need help fast, come see us on IRC. Details are on the website. My solution for your problem would be to cache the results in a List instead of always scanning the table. jdcryans On Wed, Jun 18, 2008 at 11:20 AM, Krzysztof Gałęcki <[EMAIL PROTECTED]> wrote: > Sorry fo

Problem with scanner again

2008-06-18 Thread Krzysztof Gałęcki
Sorry for spam, but looks like I sent this message as a reply to some other post, so one more time: Hi I have a problem with scanner on HTable object: I have a table with GUID-based rowId. I would like to display items from this table on pages (for example 10 items on each page). I'm cr

Re: hbase column data

2008-06-18 Thread Rong-en Fan
On Wed, Jun 18, 2008 at 3:21 PM, yunyi.x <[EMAIL PROTECTED]> wrote: > > can i save lots of data into one cell with different column tag names? > like: > > |rowKey | TimeStamp | Column | > |-| >

hbase column data

2008-06-18 Thread yunyi.x
can i save lots of data into one cell with different column tag names? like: |rowKey | TimeStamp | Column | |-| | key1 | 234255645 |colFamily:tag1 | value1 | |-
