On Fri, Aug 24, 2012 at 6:52 PM, N Keywal wrote:
> Hi Adrien,
>
>> What do you think about that hypothesis ?
>
> Yes, there is something fishy to look at here. Difficult to say
> without more logs as well.
> Are your gets totally random, or are you doing gets on rows that do
> exist? That would e
First off, regarding "inefficiency"... if version counting happened first and
filters were executed afterwards, we'd have folks "complaining" about
inefficiencies as well:
("Why does the code have to go through the versioning stuff when my filter
filters the row/column/version anyway?") ;-)
For yo
Thanks for your quick reply.
The co-processor looks like:
public void postGet(final ObserverContext<RegionCoprocessorEnvironment> e,
                    final Get get, final List<KeyValue> results) throws IOException {
  // if table is X: get some columns from table Y and add them to results
  // (TABLE_X / TABLE_Y are placeholder byte[] table names)
  if (Bytes.equals(e.getEnvironment().getRegion().getTableDesc().getName(), TABLE_X)) {
    HTableInterface tableY = e.getEnvironment().getTable(TABLE_Y);
    try {
      Result extra = tableY.get(new Get(get.getRow()));
      if (!extra.isEmpty()) results.addAll(extra.list());
    } finally {
      tableY.close();
    }
  }
}
And similar for postScannerNext().
This works in
Thanks Harsh,
Two more comments / thoughts:
1. For the mapper: a mapper normally runs on the same region server that owns
the row-key range of its input, for locality reasons (I am not 100% confident
that a mapper always runs on the same region server, please feel
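For reference, a rough sketch of how a table mapper is usually wired up; the
table, mapper, and output classes below are assumptions, not from this thread.
TableInputFormat creates one split per region and reports the region server's
host as the split location, which is what produces the locality described above.

Scan scan = new Scan();
scan.setCaching(500);        // ship rows to the mapper in batches
scan.setCacheBlocks(false);  // don't pollute the block cache from an MR job
TableMapReduceUtil.initTableMapperJob(
    "mytable",                     // assumed input table
    scan,
    MyMapper.class,                // assumed mapper class
    ImmutableBytesWritable.class,  // mapper output key
    Result.class,                  // mapper output value
    job);                          // an existing org.apache.hadoop.mapreduce.Job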
Hi All,
Here are the steps I followed to load the table in HFile v1 format:
1. Set the property hfile.format.version to 1.
2. Updated the conf across the cluster.
3. Restarted the cluster.
4. Ran the bulk loader.
Table has 34 million records and one column family.
Results:
HDFS space for one rep
Unfortunately the way I am reading/writing data from/to parts of my table would
be incompatible with this solution.
In any case, thank you very much for your time.
On Aug 28, 2012, at 4:10, Mohit Anchlia wrote:
> Have you thought of making your row key as key+timestamp? And then you can
> do scan on the columns itself?
Have you thought of making your row key as key+timestamp? And then you can
do scan on the columns itself?
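One way to realize that suggestion, sketched with assumed key/column names: the
timestamp is baked into the row key, so each iteration becomes its own row and
a prefix scan retrieves them all.

byte[] rowKey = Bytes.add(Bytes.toBytes(key), Bytes.toBytes(timestamp));
table.put(new Put(rowKey).add(CF, QUAL, value));       // one row per iteration
Scan scan = new Scan(Bytes.toBytes(key));              // start at the key prefix
scan.setFilter(new PrefixFilter(Bytes.toBytes(key)));  // stop once past the prefix
ResultScanner rs = table.getScanner(scan);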
On Mon, Aug 27, 2012 at 5:53 PM, Ioakim Perros wrote:
> Of course, thank you for responding.
>
> I have an iterative procedure where I get and put data from/to an HBase
> table, and I am set
Hi Lars:
Thanks for confirming the inefficiency of the implementation for this case. For
my case, a column can have more than 10K versions, so I need a quick way to stop
the scan from digging into the column once there is a match (ReturnCode.INCLUDE). It
would be nice to have a ReturnCode that can noti
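A sketch of the kind of short-circuit being asked for, assuming a custom filter
and that the installed version offers ReturnCode.INCLUDE_AND_NEXT_COL; the class
and value below are invented for illustration. Since filters run before version
counting, returning INCLUDE_AND_NEXT_COL stops the scan from digging further
into the column after the first match.

public class FirstMatchFilter extends FilterBase {
  private byte[] target;                 // value to match; assumed
  public FirstMatchFilter() {}           // required for deserialization
  public FirstMatchFilter(byte[] target) { this.target = target; }
  @Override
  public ReturnCode filterKeyValue(KeyValue kv) {
    return Bytes.equals(kv.getValue(), target)
        ? ReturnCode.INCLUDE_AND_NEXT_COL  // keep the match, skip remaining versions
        : ReturnCode.SKIP;                 // discard this version, keep looking
  }
  public void write(DataOutput out) throws IOException { Bytes.writeByteArray(out, target); }
  public void readFields(DataInput in) throws IOException { target = Bytes.readByteArray(in); }
}

The filter would still need to be on the region servers' classpath to be usable
from a scan.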
Of course, thank you for responding.
I have an iterative procedure where I get and put data from/to an HBase
table, and I am setting each Put's timestamp to the iteration number,
as it is efficient to check for convergence in this
way (by just retrieving the 2 last versions of my
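A minimal sketch of that convergence check, with row/column names assumed:

Get get = new Get(row);
get.addColumn(CF, QUAL);
get.setMaxVersions(2);  // the two most recent iterations
List<KeyValue> last2 = table.get(get).getColumn(CF, QUAL);  // newest first
boolean converged = last2.size() == 2
    && Bytes.equals(last2.get(0).getValue(), last2.get(1).getValue());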
You mean timestamp as in version? Can you describe your scenario with a more
concrete example?
On Mon, Aug 27, 2012 at 5:01 PM, Ioakim Perros wrote:
> Hi,
>
> Is there any way of retrieving two values with totally different
> timestamps from a table?
>
> I am using timestamps as iteration counts, and I
Currently filters are evaluated before we do version counting.
Here's a comment from ScanQueryMatcher.java:
/**
 * Filters should be checked before checking column trackers. If we do
 * otherwise, as was previously being done, ColumnTracker may increment its
 * counter for even that KV which may be discarded later on by Filter. This
 * would lead to incorrect results in certain cases.
 */
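One consequence of that ordering, sketched with assumed names: with a value
filter plus setMaxVersions(1), every stored version is run through the filter
first, and the max-versions count applies only to the KVs the filter lets
through.

Scan scan = new Scan();
scan.setMaxVersions(1);  // applied after the filter
scan.setFilter(new ValueFilter(CompareFilter.CompareOp.EQUAL,
    new BinaryComparator(Bytes.toBytes("v42"))));
// An old version whose value is "v42" can still be returned: the filter has
// already discarded the newer versions before version counting sees them.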
Hi,
Is there any way of retrieving two values with totally different
timestamps from a table?
I am using timestamps as iteration counts, and I would like to be able
to get at each iteration (besides the previous iteration results from
table) some pre-computed amounts I save at some columns w
Also confirmed via experiment (in the memstore, store files, mixed store files,
mixed store files and memstore).
-- Lars
- Original Message -
From: Lars H
To: user@hbase.apache.org
Cc:
Sent: Monday, August 27, 2012 3:52 PM
Subject: Re: MemStore and prefix encoding
Oops. The KVs are
Oops. The KVs are sorted in reverse chronological order. So I was wrong. It'll
return the newest version.
Sorry about that confusion. The book is correct.
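In other words, within a {row, family, qualifier} the KVs sort by descending
timestamp, so a plain Get (names assumed) sees the newest version:

Get get = new Get(row);
byte[] newest = table.get(get).getValue(CF, QUAL);  // value with the highest timestamp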
-- Lars
Tom Brown wrote:
>Lars,
>
>I have been relying on the expected behavior (if I write another cell
>with the same {key, family, qual
Hi Alex:
We decided to use setTimeRange and setMaxVersions, and remove the column
with a reference timestamp (i.e. we don't put this column into hbase
anymore). This behavior is what we would like, but it seems very inefficient
because all versions are processed before setMaxVersions takes effect.
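The access pattern described above, as a sketch with assumed names; the
reference timestamp now lives client-side instead of in a column:

Get get = new Get(row);
get.setTimeRange(0, referenceTimestamp);  // only versions older than the reference
get.setMaxVersions(1);                    // newest version inside that range
Result r = table.get(get);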
Anil,
Please let us know how well this works.
On Mon, Aug 27, 2012 at 4:19 PM, anil gupta wrote:
> Hi Guys,
>
> I was digging through the hbase-default.xml file and I found this property
> related to HFile handling:
>
> <property>
>   <name>hfile.format.version</name>
>   <value>2</value>
>   <description>The HFile
On Mon, Aug 27, 2012 at 9:20 AM, Tom Brown wrote:
> Lars,
>
> I have been relying on the expected behavior (if I write another cell
> with the same {key, family, qualifier, version} it won't return the
> previous one) so your answer was confusing to me. I did more
> research and I found that the
Hi Guys,
I was digging through the hbase-default.xml file and I found this property
related to HFile handling:

<property>
  <name>hfile.format.version</name>
  <value>2</value>
  <description>The HFile format version to use for new files. Set this to 1 to
  test backwards-compatibility. The default value of this op
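For the HFile v1 experiment described earlier in the thread, the override would
go into hbase-site.xml on every node, roughly:

<property>
  <name>hfile.format.version</name>
  <value>1</value>  <!-- write new store files in the old v1 format -->
</property>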
On Mon, Aug 27, 2012 at 11:04 AM, Doug Meil
wrote:
>
> I think somewhere in here in the RefGuide would work…
>
> http://hbase.apache.org/book.html#other.info.sites
>
>
That looks good. We don't have a pig section in the refguide? You up
for adding a paragraph Russell? Could link to your blog i
I think somewhere in here in the RefGuide would work…
http://hbase.apache.org/book.html#other.info.sites
On 8/27/12 1:20 PM, "Stack" wrote:
>On Mon, Aug 27, 2012 at 6:32 AM, Russell Jurney
> wrote:
>> I wrote a tutorial around HBase, JRuby and Pig that I thought would be
>>of
>> interest
Hi there, in addition there is a fair amount of documentation about bulk
loads and importtsv in the HBase RefGuide.
http://hbase.apache.org/book.html#importtsv
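Roughly, the two-step flow those docs describe; paths, table, and column names
here are placeholders:

hadoop jar ${HBASE_HOME}/hbase-${VERSION}.jar importtsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1 \
  -Dimporttsv.bulk.output=/tmp/bulkout mytable /user/me/input.tsv
hadoop jar ${HBASE_HOME}/hbase-${VERSION}.jar completebulkload /tmp/bulkout mytable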
On 8/27/12 9:34 AM, "Ioakim Perros" wrote:
>On 08/27/2012 04:18 PM, o brbrs wrote:
>> Hi,
>>
>> I'm new to HBase and I want to do a bulk
On Mon, Aug 27, 2012 at 10:31 AM, Russell Jurney
wrote:
> Yes, and if possible the HBase and JRuby page needs to be updated. If you
> can grant me wiki access, I can edit it myself.
>
> http://wiki.apache.org/hadoop/Hbase/JRuby
>
I added access for a login of RussellJurney (Sorry. I believe thi
Yes, and if possible the HBase and JRuby page needs to be updated. If you
can grant me wiki access, I can edit it myself.
http://wiki.apache.org/hadoop/Hbase/JRuby
On Mon, Aug 27, 2012 at 10:20 AM, Stack wrote:
> On Mon, Aug 27, 2012 at 6:32 AM, Russell Jurney
> wrote:
> > I wrote a tutorial
On Mon, Aug 27, 2012 at 6:32 AM, Russell Jurney
wrote:
> I wrote a tutorial around HBase, JRuby and Pig that I thought would be of
> interest to the HBase users list:
> http://hortonworks.com/blog/pig-as-hadoop-connector-part-two-hbase-jruby-and-sinatra/
>
Thanks Russell. Should we add a link in
Lars,
I have been relying on the expected behavior (if I write another cell
with the same {key, family, qualifier, version} it won't return the
previous one) so your answer was confusing to me. I did more
research and I found that the HBase guide specifies that behavior (see
section 5.8.1 of htt
Not necessarily consecutive, unless the request itself is so. It only
returns 500 rows that match the user's request.
The user's request (a specific row range plus filters) is usually
embedded in the Scan object sent to the RS. Whatever is accumulated
as the result of the Scan operation (server-si
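So with, say, a caching value of 500 (names assumed):

Scan scan = new Scan(Bytes.toBytes("row100"));  // start of the requested range
scan.setCaching(500);  // the RS accumulates up to 500 matching rows per RPC
// Only rows that pass the scan's range and filters count toward the 500; it is
// not blindly "the next 500 row-keys".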
Hi Harsh,
I read through the document you referred to; regarding the comment below, I am
confused. My main confusion is: does it mean HBase will transfer 500 consecutive
rows to the client (supposing the client mapper wants the row with row-key 100,
HBase will return row-keys 100 to 600 at one time to the client, similar
I want to do a lot of random reads, but I need to get the first row after
the requested key. I know I can make a scanner every time (with a specified
startrow) and close it after a single result is fetched, but this seems
like a lot of overhead.
Something like HTable's getRowOrBefore method, but then
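The scanner-based version of that lookup, sketched with assumed names:

Scan scan = new Scan(requestedKey);  // startrow = the requested key
scan.setCaching(1);                  // only one row is ever wanted
ResultScanner scanner = table.getScanner(scan);
try {
  Result first = scanner.next();     // first row at-or-after requestedKey, or null
} finally {
  scanner.close();
}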
On 08/27/2012 04:18 PM, o brbrs wrote:
Hi,
I'm new to HBase and I want to do a bulk load from HDFS to HBase with Java.
Is there any sample code which includes importtsv and completebulkload
libraries on java?
Thanks.
Hi,
Here is a sample configuration of a bulk loading job consisting only of
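A minimal sketch of such a job, with driver, mapper, and path names assumed;
HFileOutputFormat.configureIncrementalLoad wires in the reducer, partitioner
and output format, and LoadIncrementalHFiles plays the completebulkload role:

Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf, "bulk-load");
job.setJarByClass(BulkLoadDriver.class);        // assumed driver class
job.setMapperClass(TsvToKeyValueMapper.class);  // assumed mapper emitting KeyValues
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(KeyValue.class);
FileInputFormat.addInputPath(job, new Path("/user/me/input"));
FileOutputFormat.setOutputPath(job, new Path("/user/me/hfiles"));
HTable table = new HTable(conf, "mytable");
HFileOutputFormat.configureIncrementalLoad(job, table);
if (job.waitForCompletion(true)) {
  new LoadIncrementalHFiles(conf).doBulkLoad(new Path("/user/me/hfiles"), table);
}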
Allow me to refer to previous discussion:
http://mail-archives.apache.org/mod_mbox/hbase-user/201203.mbox/%3CCABsY1jQ8+OiLh7SYkXZ8iO=nosy8khz7iys+6w4u6sxcpj5...@mail.gmail.com%3E
If the above doesn't answer your question, please give us more details
about the versions of HBase and PIG you're using
Hi,
I've created a co-processor which inserts more columns into the
result by overriding preGet and postScannerNext.
And it is registered as a system co-processor, works with hbase shell,
both get and scan.
When I tried to access those columns in PIG, it simply returned null
for the column.
Is the