Using DirectFileOutputCommitter with distcp

2017-03-16 Thread Daniel Haviv
Hi, Is it possible to use DirectFileOutputCommitter with distcp? Thank you, Daniel

Random nodemanager crashes SIGSEGV

2016-08-22 Thread Daniel Haviv
Hi, In the last 24 hours our node managers keep crashing due to SIGSEGV. The only info I could find was in the hs_err_.pid files, which include the following Java stack: Stack: [0x7f756a30f000,0x7f756a41], sp=0x7f756a40dea0, free space=1019k Native frames: (J=compiled

Re: Fast way to read thousands of double value in hadoop jobs

2016-08-19 Thread Daniel Haviv
jobs did repeated conversion from Text to double. I resolved > this by correcting SequenceFile format. Now I store serialised java object > in SeqFile and my map jobs are faster. > > -- > Madhav Sharan > > > On Wed, Aug 17, 2016 at 11:07 PM, Daniel Haviv <danielru...@gmail
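The point made in this message can be illustrated with a small sketch. It does not use the SequenceFile API. It only shows the underlying idea, under the assumption that the cost came from parsing text: doubles stored as fixed-width binary are decoded in one step, while text has to be parsed token by token. The function names and the record layout (a 4-byte count followed by 8-byte doubles) are hypothetical.

```python
import struct

def encode_doubles(values):
    """Pack floats as a 4-byte big-endian count followed by 8-byte IEEE 754 doubles."""
    return struct.pack(">i", len(values)) + struct.pack(f">{len(values)}d", *values)

def decode_doubles(blob):
    """Decode a record written by encode_doubles; no text parsing is involved."""
    (count,) = struct.unpack_from(">i", blob, 0)
    return list(struct.unpack_from(f">{count}d", blob, 4))

def decode_text(line):
    """Text equivalent: every value has to be parsed from its string form."""
    return [float(token) for token in line.split()]

row = [1.5, -2.25, 3.0e10]
blob = encode_doubles(row)
print(len(blob), decode_doubles(blob) == decode_text("1.5 -2.25 3.0e10"))
```

In a real job the same effect would come from writing a binary value type into the SequenceFile instead of Text, as the message describes.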

Re: Fast way to read thousands of double value in hadoop jobs

2016-08-18 Thread Daniel Haviv
Store them in a SequenceFile. On Thursday, 18 August 2016, Madhav Sharan wrote: > Hi, can someone please recommend a fast way in Hadoop to store and > retrieve a matrix of double values? > > As of now we store values in text files and then read them in Java using HDFS >

Re: NodeManager High CPU due to high GC

2016-01-23 Thread Daniel Haviv
Hi Randy, How many cores do you have on your machines and how many did you allocate to YARN? Daniel On Saturday, 23 January 2016, Randy Fox wrote: > Hi, > > We just upgraded to using Yarn on Hadoop 2.6.0 – CDH5.4.5 > We are running a large job – 200K mappers, 100K reducers

Re: failed to start namenode

2015-11-20 Thread Daniel Haviv
Are you sure the host is up? On Friday, 20 November 2015, siva kumar wrote: > Hi Sandeep, > I'm trying to start using Cloudera Manager. This > is the error message I'm getting. The log is not getting generated in the > log directory. > > Supervisor

Re: Chaining MapReduce

2015-08-22 Thread Daniel Haviv
Hi, Data is divided among mappers according to your InputFormat. Usually the number of mappers equals the number of blocks. Daniel On 22 Aug 2015, at 09:02, ☼ R Nair (रविशंकर नायर) ravishankar.n...@gmail.com wrote: Hi, The mappers depend on source data only. But data definitely is going through
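The rule of thumb in this reply (mappers roughly equal to blocks) can be sketched as simple arithmetic. This is only an estimate under stated assumptions: a file-based input format whose splits follow HDFS block boundaries, and an example block size of 128 MB. The function name is hypothetical, empty files are ignored for simplicity, and actual split counts depend on the input format and its configuration.

```python
def count_blocks(file_sizes, block_size=128 * 1024 * 1024):
    """Return the total number of blocks needed to store files of the given sizes (bytes).

    Each non-empty file occupies ceil(size / block_size) blocks; integer
    arithmetic avoids floating-point rounding on very large sizes.
    """
    return sum((size + block_size - 1) // block_size for size in file_sizes if size > 0)

MB = 1024 * 1024
# A 300 MB file spans three 128 MB blocks; a 64 MB file fits in one.
print(count_blocks([300 * MB, 64 * MB]))
```

Under these assumptions the result is the number of map tasks one would usually expect for the input.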

Container isolation

2015-06-26 Thread Daniel Haviv
Hi, Is there some kind of security aspect to a container in terms of local filesystem access? Is it possible, for example, to chroot containers so they won't be able to read or write anywhere on the local FS except their own home dir? Thanks, Daniel

Re: Copy data between clusters during the job execution.

2015-02-02 Thread Daniel Haviv
You can use distcp. Daniel On 2 Feb 2015, at 11:12, xeon Mailinglist xeonmailingl...@gmail.com wrote: Hi I want to have a job that copies the map output, or the reduce output to another hdfs. Is it possible? E.g., the job runs in cluster 1 and takes the input from this cluster.

Re: Copy data between clusters during the job execution.

2015-02-02 Thread Daniel Haviv
? On 02-02-2015 10:20, Daniel Haviv wrote: You can use distcp Daniel On 2 Feb 2015, at 11:12,

Re: Which [open-source] SQL engine atop Hadoop?

2015-01-27 Thread Daniel Haviv
Can you elaborate on why you prefer Tajo? Daniel On 27 Jan 2015, at 10:35, Azuryy Yu azury...@gmail.com wrote: You listed almost all the open-source MPP real-time SQL-on-Hadoop engines. I prefer Tajo, which released 0.9.0 recently and is still a work in progress toward 1.0 On Mon, Jan 26,

Re: Edits log apply performance

2015-01-19 Thread Daniel Haviv
/browse/HDFS-4923 https://issues.apache.org/jira/browse/HDFS-6353 https://issues.apache.org/jira/browse/HDFS-7609 Chris Nauroth Hortonworks http://hortonworks.com/ On Sat, Jan 17, 2015 at 9:17 AM, Daniel Haviv danielru...@gmail.com wrote: Hi, After restarting the namenode we

Edits log apply performance

2015-01-17 Thread Daniel Haviv
Hi, After restarting the namenode we discovered that there had been no checkpoint for quite a while. We are waiting for all the changes to be applied to the fsimage, but it seems like it will take hours. Is there something we can do to expedite the process? Increase parallelism? Anything at all?