Thanks JD. I will shut down one of the ZK instances.
To Michael and JD, I will start another thread regarding
performance, with more details.
JM
2012/6/26, Michael Segel :
> Network is always good to check, it's all fun and games until an
> interface negotiates 100Mb.
>
> 50ms per get sounds a bit extreme.
Funny you should mention hardware.
I did submit a talk on cluster design to Strata (NY and London). Seems it didn't
make the cut for NY, but who knows about Lond
On Tue, Jun 26, 2012 at 10:56 AM, Jean-Marc Spaggiari
wrote:
> Am I better off running it on 1? Or on 3? I just want to do some testing
> for now. For ZK, can I keep it on only one server for now? Or will it be
> more efficient if I configure it on 3?
FWIW your system will be as available as PC1 i
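
For context on the 1-versus-3 question: ZooKeeper stays available only while a strict majority of the ensemble is up. The small sketch below just works out that arithmetic; the function name is made up for illustration and is not part of any ZooKeeper or HBase API.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a strict majority stays up.

    Hypothetical helper for illustration only.
    """
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return (ensemble_size - 1) // 2


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "server(s): survives", tolerated_failures(n), "failure(s)")
```

Under that rule, 2 servers tolerate no more failures than 1, which is why a test setup is usually run with either a single instance or three.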
>>>>> able to compare the 2 options I have (HBase or relational) and
>>>>>>>>>>>> decide
>>>>>>>>>>>> based on the results.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> assuming
>>>>>>>>>>>> that what you are describing is a stand alone system.
>>>>>>>>>>>> It would be easier and better if you just used a simple
>>>>>>>>>>>>
>>>>>>>>>>> Retrieve the data in ascending order by timestamp and take the
>>>>>>>>>>> top 500 off the list.
>>>>>>>>>>>
>>>>>>>>>>> If y
>>>>>>>>>>
>>>>>>>>>> Now you can just do a partial scan on your index. Because your
>>>>>>>>>> index table is so small... you shouldn't worry
>>>
>>>>>>>>>> My goal is to store information about files in the database. I need
>>>>>>>>>> to check the oldest files in the database to refresh the information.
>>>>>>>>>>
>
>> >>>>>>> update its row in the database based on the key, including a
>> >>>>>>> "last_update" field.
>> >>>>>>> I can calculate this key for any file in the drives.
>> >>>>>>>
>>
>>
> >>>>>>> I'm mainly doing 3 operations in the database.
> >>>>>>> 1) I retrieve a list of 500 files which need to be updated
> >>>>>>> 2) I update the information for those 500 files (bulk update)
> >>>>>
>>>>>>> distribution is almost perfect because I'm using a hash. The prefix is
>>>>>>> the server ID but it's not always going to the same server since it's
>>>>>>> done by last_update. But this allows quick access to the list of
>>>>>>> files from one server.
>
he
>>>>>> network, but it should be "often". The faster the database update
>>>>>> will be, the more up to date I will be able to keep it.
>>>>>>
>>>>>> JM
>>>>>>
>>>>>> 2012/6/14, Michael Se
>>>>>>>>> field.
>>>>>>>>> I can easily find the file again with the ID and find when it has
>>>>>>>>> been updated.
>>>>>>>>>
>>>>>>>>> But what
o be against the key.
>>>>> Not sure if you meant to say that your key was a composite key or not...
>>>>> sounds like your key is just the unique key and the rest are columns in the
>>>>>
>>>> key, you will end up creating new rows of data while the old row still
>>>> exists.
>>>> Not a good side effect.
>>>>
e -
>http://sematext.com/spm
>
>
>
>>
>> From: Jean-Marc Spaggiari
>>To: user@hbase.apache.org
>>Sent: Wednesday, June 13, 2012 12:16 PM
>>Subject: Timestamp as a key good practice?
>>
>>I watched Lars George's v
.
>>>>
>>>> The other nasty side effect of using time as your key is that you not
>>>> only
>>>> have the potential for hot spotting, but that you also have the nasty
>>>> side
>>>> effect of creating splits that will never grow.
>>>>
they were not
>>> updated in the last couple of days/minutes? If it's infrequent, then you really
>>> should care if you have to do a complete table scan.
>>>
>>>
>>>
>>>
>>> On Jun 14, 2012, at 5:39 AM, Jean-Marc S
ematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
>>>
>> Thanks,
>>
>> JM
>>
>> 2012/6/14, Otis Gospodnetic :
>>> JM, have a look at https://github.com/sematext/HBaseWD (this comes up
>>> often.... Doug, maybe you could add it to the Ref Guide?)
>>>
>>> Otis
>>>
>>> Performance Monitoring for Solr / ElasticSearch / HBase -
>> http://sematext.com/spm
>>
>>
>>
I watched Lars George's video about HBase and read the documentation,
and they say that it's not a good idea to use the timestamp as a
key because that will always load the same region until the timestamp
reaches a certain value and moves to the next region (hotspotting).
I have a table with a uni
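
To make the hotspotting point concrete, here is a minimal sketch of the usual workaround: put a small bucket prefix in front of the timestamp so that sequential writes spread over several key ranges. This is only an illustration of the general idea, not the HBaseWD API; the bucket count, the MD5-based bucket choice, the key layout, and all names below are assumptions.

```python
import hashlib
import struct

NUM_BUCKETS = 16  # assumed bucket count, chosen only for this example


def salted_key(timestamp_ms: int, file_id: str,
               num_buckets: int = NUM_BUCKETS) -> bytes:
    """Build a row key laid out as <bucket byte><8-byte timestamp><file id>.

    The bucket is derived from a hash of the file id, so consecutive
    timestamps no longer land next to each other in one region.
    """
    id_bytes = file_id.encode("utf-8")
    bucket = hashlib.md5(id_bytes).digest()[0] % num_buckets
    # Big-endian packing keeps byte order equal to numeric order
    # for non-negative timestamps.
    return bytes([bucket]) + struct.pack(">Q", timestamp_ms) + id_bytes


def bucket_prefixes(num_buckets: int = NUM_BUCKETS) -> list:
    """One key prefix per bucket; a reader scans each and merges the results."""
    return [bytes([b]) for b in range(num_buckets)]
```

The trade-off is on the read side: finding the oldest rows now means one scan per bucket plus a merge, instead of a single scan from the start of the table.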