When the client does group commits, does it group by row key or by region server?
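
To make the question concrete, here is a rough sketch of what I mean by grouping by region server. This is not the actual HBase client code: the class name, the method name, and the map from region start key to server name are all made up for illustration, and I am assuming the first region has an empty start key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical sketch, not HBase client code.
class GroupCommitSketch {

    // regionStartToServer: region start key -> name of the server hosting
    // that region (assumed layout; the first region starts at "").
    // Returns server name -> row keys to send to that server in one batch.
    static Map<String, List<String>> groupByServer(
            NavigableMap<String, String> regionStartToServer,
            List<String> rowKeys) {
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (String row : rowKeys) {
            // The region holding a row is the one with the greatest
            // start key that is <= the row key.
            Map.Entry<String, String> region = regionStartToServer.floorEntry(row);
            if (region == null) {
                throw new IllegalArgumentException("no region covers row: " + row);
            }
            batches.computeIfAbsent(region.getValue(), s -> new ArrayList<>()).add(row);
        }
        return batches;
    }
}
```

Grouping by row key alone would only merge edits to the same row, so a bulk upload of mostly distinct rows would still pay roughly one round trip per row. Grouping by server, as sketched, would give one batch per server per flush.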

On Sun, Jun 28, 2009 at 12:08 AM, Ryan Rawson <[email protected]> wrote:
> I imported 9b rows in 5 days or so, a few minor crashes, average speed
> between 50-200 k ops/sec.  The client needs some love to make it more
> efficient on grouping commits during bulk upload.
>
> On Jun 27, 2009 4:02 PM, "Andrew Purtell" <[email protected]> wrote:
>
> Test:
>
> - Latest trunk.
>
> - Config modified only with a store file split threshold of 1GB
>
> - 4 node testbed:
>   1) namenode, datanode, hmaster, heritrix, jobtracker
>   2) datanode, regionserver, heritrix, tasktracker, mapper (2)
>   3) datanode, regionserver, heritrix, tasktracker, mapper (2)
>   4) datanode, regionserver, heritrix, tasktracker, mapper (2)
>
> - 100 heritrix threads - 4 hosts, 25 threads each - feeding in ~5MB/sec
> average new edits
>
> - 2 mappers x 3 hosts processing new edits and writing back
> serialized/compressed Documents
>
> - 3K average transactions/sec reported by master
>
> - 'hadoop balancer -threshold 0.1'
>
> - 1 hour test run
>
> Result:
>
> Passed with no incidents!
>
>  - Andy
>
