Re: Mismatch in length of source:

2017-01-02 Thread Ulul
Hi I can't remember the exact error message but distcp consistently fails when trying to copy open files. Is that your case? The workaround is to snapshot prior to copying Ulul On 31/12/2016 19:25, Aditya exalter wrote: Hi All, A very happy new year to ALL. I am facing issue
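
For illustration, a minimal sketch of the snapshot-then-distcp workaround; the /data path and the namenode host names are hypothetical:

    hdfs dfsadmin -allowSnapshot /data
    hdfs dfs -createSnapshot /data beforeCopy
    # copy from the frozen .snapshot path so open files can't change under distcp
    hadoop distcp hdfs://source-nn:8020/data/.snapshot/beforeCopy hdfs://target-nn:8020/data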

Re: HDFS client with Knox

2017-01-02 Thread Ulul
id in d), and then clear streams between Knox and HDFS servers, the cluster being protected by a firewall of some kind. Please note that Knox creates a bottleneck through which all data is flowing so don't use it for massive data transfer Ulul On 01/01/2017 15:46, Ted Yu wrote: Can y
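
As a sketch of why everything transits the gateway, a WebHDFS read issued against a Knox endpoint; the host, port, topology name ("default") and credentials below are hypothetical:

    # -L follows the rewritten redirect, which also points back through the gateway,
    # so the file content itself flows through Knox
    curl -kLu user:password \
      'https://knox-host:8443/gateway/default/webhdfs/v1/tmp/sample.txt?op=OPEN'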

Re: how to bound hadoop with network card

2016-12-27 Thread Ulul
-project-dist/hadoop-hdfs/HdfsMultihoming.html Ulul On 26/12/2016 08:23, lk_hadoop wrote: hi, all: on my cluster, one node has two network cards, one for the intranet and one for the public network. Even though I have configured /etc/hosts to bind the hostname to the intranet IP, hadoop itself also recognizes the
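
The HdfsMultihoming page linked above boils down to the *-bind-host properties; a quick way to see what a multihomed node currently uses (the keys may simply be unset, in which case getconf reports them as missing):

    hdfs getconf -confKey dfs.namenode.rpc-bind-host
    hdfs getconf -confKey dfs.namenode.http-bind-host
    # setting these (and the servicerpc/https variants) to 0.0.0.0 in hdfs-site.xml
    # makes the daemons listen on both cards instead of only the /etc/hosts address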

Re: issues about hadoop-0.20.0

2015-07-18 Thread Ulul
Hi I'd say that no matter what version is running, the parameters don't seem to fit the cluster, which doesn't manage to handle 100 maps that each process a billion samples: it's hitting the mapreduce timeout of 600 seconds. I'd try with something like 20 10 Ulul On 18/07/2
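
Assuming the job in question is the bundled pi estimator (arguments are the number of maps and the samples per map), the suggestion above would look like this; the jar name varies with the release:

    hadoop jar hadoop-examples.jar pi 20 10
    # if long tasks are legitimate, the 600 s limit is mapred.task.timeout (ms) in 0.20,
    # mapreduce.task.timeout in Hadoop 2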

SolR integration in HDP

2015-03-08 Thread Ulul
, I could as well stick to Elasticsearch - which I know - since they release MR integration and Yarn support as OSS, and I can integrate with HBase through Phoenix and JDBC. Any thoughts and feedback welcome. Thank you Ulul

Re: AW: AW: Hadoop 2.6.0 - No DataNode to stop

2015-03-03 Thread Ulul
ng as you're using the same user to start and stop your daemons. Ulul On 03/03/2015 00:14, Daniel Klinger wrote: Hi, thanks for your help. The HADOOP_PID_DIR variable is pointing to /var/run/cluster/hadoop (which has hdfs:hadoop as its owner). 3 PIDs are created there (datanode na
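
To check the situation Daniel describes, something along these lines; the path follows the thread and the hadoop-<user>-<daemon>.pid naming is the usual convention, which may differ on a given install:

    ls -l /var/run/cluster/hadoop
    cat /var/run/cluster/hadoop/hadoop-hdfs-datanode.pid
    ps -fp "$(cat /var/run/cluster/hadoop/hadoop-hdfs-datanode.pid)"   # is that PID really a live DataNode?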

Re: AW: Hadoop 2.6.0 - No DataNode to stop

2015-03-02 Thread Ulul
Hi The hadoop-daemon.sh script prints the "no $command to stop" message if it doesn't find the pid file. You should echo the $pid variable and see if you have a correct pid file there. Ulul On 02/03/2015 13:53, Daniel Klinger wrote: Thanks for your help. But unfortunately this didn't do th
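
A rough sketch of the check hadoop-daemon.sh performs; the pid file name pattern below is an assumption, so echo the real $pid from the script to be sure:

    pid="$HADOOP_PID_DIR/hadoop-$USER-datanode.pid"
    echo "$pid"
    if [ -f "$pid" ]; then
      ps -p "$(cat "$pid")"          # stop only works if this PID is still alive
    else
      echo "no datanode to stop"     # the message seen in the thread
    fi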

Re: cleanup() in hadoop results in aggregation of whole file/not

2015-03-01 Thread Ulul
Hi I probably misunderstood your question because my impression is that it's typically a job for a reducer. Emit "local" min and max with two keys from each mapper and you will easily get the global min and max in the reducer Ulul On 28/02/2015 14:10, Shahab Yunus wrote: As far

Re: cleanup() in hadoop results in aggregation of whole file/not

2015-03-01 Thread Ulul
Edit: instead of buffering in a hash and then emitting at cleanup you can use a combiner. Likely slower but easier to code if speed is not your main concern On 01/03/2015 13:41, Ulul wrote: Hi I probably misunderstood your question because my impression is that it's typically a job

Re: Hadoop 2.6.0 - No DataNode to stop

2015-03-01 Thread Ulul
Hi Did you check your slaves file is correct? That the datanode process is actually running? Did you check its log file? That the datanode is available? (dfsadmin -report, or through the web UI) We need more detail Ulul On 28/02/2015 22:05, Daniel Klinger wrote: Thanks but I know how to
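
The same checklist as commands, assuming a stock Hadoop 2.6 layout (paths differ per install; on 1.x the slaves file lives under conf/):

    cat "$HADOOP_HOME"/etc/hadoop/slaves              # is the node listed?
    jps | grep -i datanode                            # is the process running on the slave?
    tail -n 100 "$HADOOP_HOME"/logs/hadoop-*-datanode-*.log
    hdfs dfsadmin -report                             # does the namenode see the datanode?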

Re: Default Block Size in HDFS

2015-02-22 Thread Ulul
Sorry, I forgot that "as of" meant "starting with" :-) Actually 128 MB started around 2.2.0; 2.0.5-alpha was still 64 MB. In any case it's just a default, it is often raised in production On 22/02/2015 20:51, Ted Yu wrote: As of Hadoop 2.6, the default blocksize is 128 MB (look for dfs.blocksize) htt
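
Two quick ways to confirm the effective value on a given cluster (the file path below is hypothetical):

    hdfs getconf -confKey dfs.blocksize        # 134217728 = 128 MB on recent releases
    hdfs dfs -stat %o /some/existing/file      # block size actually used for that file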

Re: Using Slider as a default mechanism for HBase on HDP 2.2

2015-02-22 Thread Ulul
n't hesitate to share Ulul On 22/02/2015 16:33, Krishna Kishore Bonagiri wrote: Hi, We just installed HDP 2.2 through Ambari. We were under the impression that in HDP 2.2, the default deployment mechanism for HBase/Accumulo is through Slider (i.e., they are enabled by default for Y

Re: BLOCK and Split size question

2015-02-22 Thread Ulul
combined to make an input split. To complete the non-aligned block answer: mapreduce will download the missing record part for you from another DN. Cheers Ulul On 22/02/2015 03:19, Ahmed Ossama wrote: Hi, Answering the first question; What happens is that the client on the ma

Re: Hadoop - HTTPS communication between nodes - How to Confirm ?

2015-02-21 Thread Ulul
of -p | grep TCP will show you the DN listening on 50075 for HTTP, 50475 for HTTPS. For the namenode that would be 50070 and 50470 Ulul On 21/02/2015 19:53, hadoop.supp...@visolve.com wrote: Hello Everyone, We are trying to measure performance between the HTTP and HTTPS versions on Hadoop DFS, Mapr
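
Spelled out, with the ports from the reply (run on a datanode and a namenode respectively; the curl host is hypothetical):

    lsof -nP -iTCP -sTCP:LISTEN | grep -E '50075|50475'   # DN HTTP / HTTPS
    lsof -nP -iTCP -sTCP:LISTEN | grep -E '50070|50470'   # NN HTTP / HTTPS
    curl -k https://datanode-host:50475/                  # only answers if HTTPS is really on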

Re: Missing FSImage file

2015-02-21 Thread Ulul
If you don't have a coherent set of files, hdfs namenode -recover is your friend (I never had to use that though, so I can just wish you luck...) Ulul On 20/02/2015 12:21, tHejEw limudomsuk wrote: Dear all, I have been running hadoop 2.4.0 with a single node since Nov 2014. After I receive error from we
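
For reference, the recovery is run with the namenode stopped, and it is worth copying dfs.namenode.name.dir somewhere safe first since the tool can discard edits it cannot read:

    # last-resort only, as the message says
    hdfs namenode -recover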

Re: Yarn AM is abending job when submitting a remote job to cluster

2015-02-19 Thread Ulul
22:56, Ulul wrote: In that case it's just between your hdfs client, the NN and the DNs, no YARN or MR component involved. The fact that this works is not related to your MR job not succeeding. On 19/02/2015 22:45, roland.depratti wrote: Thanks for looking at my problem. I can run an hd

Re: Yarn AM is abending job when submitting a remote job to cluster

2015-02-19 Thread Ulul
m the client, with the config file listed, that does a cat on a file in hdfs on the remote cluster and returns the contents of that file to the client. - rd Sent from my Verizon Wireless 4G LTE smartphone Original message From: Ulul Date:02/19/2015 4:03 PM (GMT-05:00) To:

Re: Yarn AM is abending job when submitting a remote job to cluster

2015-02-19 Thread Ulul
'hdfs://' though the log doesn't suggest it has anything to do with your problem. And what do you mean by an "HDFS job"? Ulul On 19/02/2015 04:22, daemeon reiydelle wrote: I would guess you do not have your ssl certs set up, client or server, based on the error.

Re: adding another namenode to an existing hadoop instance

2015-02-18 Thread Ulul
ate a standby node, you need to configure HA Ulul On 18/02/2015 21:14, Ulul wrote: Hi What you displayed is not your hadoop version but your java version. For the hadoop version remove the dash: hdfs version. dfs.namenode.name.dir is the dir for the namenode process to store the filesystem

Re: adding another namenode to an existing hadoop instance

2015-02-18 Thread Ulul
lding production data Cheers Ulul On 18/02/2015 10:50, Mich Talebzadeh wrote: Hi, I have a Hadoop instance (single node) installed on RHES5 running OK. The version of Hadoop is hdfs -version java version "1.7.0_25" Java(TM) SE Runtime Environment (build 1.7.0_25-b15) Java HotSpot(TM

Re: free some space on data volume

2015-02-16 Thread Ulul
Hi Check out the last two comments on HDFS-1312: https://issues.apache.org/jira/browse/HDFS-1312 If you can't afford to wipe out DN data and rebalance, you can try out the proposed script (after non-prod testing obviously) Ulul On 16/02/2015 09:52, Georgi Ivanov wrote: Hi, I ne

Re: Question about mapp Task and reducer Task

2015-02-16 Thread Ulul
;t give you feedback right now, sorry Ulul On 16/02/2015 00:47, 杨浩 wrote: hi ulul thank you for the explanation. I have googled the feature, and Hortonworks said "This feature is a technical preview and considered under development. Do not use this feature in your production systems." can we

Re: Question about mapp Task and reducer Task

2015-02-15 Thread Ulul
Hi Actually it depends: in MR1 each mapper or reducer will be executed in its own JVM, in MR2 you can activate uberjobs that will let the framework serialize small jobs' mappers and reducers in the applicationmaster JVM. Look for the mapreduce.job.ubertask.* properties Ulul On 15/02/20
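
The properties referred to above, passed at submission time for illustration; the jar, driver name, and thresholds are placeholders, and -D is only honoured if the driver goes through ToolRunner/GenericOptionsParser:

    hadoop jar myjob.jar MyDriver \
      -D mapreduce.job.ubertask.enable=true \
      -D mapreduce.job.ubertask.maxmaps=9 \
      -D mapreduce.job.ubertask.maxreduces=1 \
      input output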

WebHDFS and 100-continue

2015-02-15 Thread Ulul
would be kept and that we would have: To NN: Expect: 100-continue, response 307 (redirect to a DN); To DN: Expect: 100-continue, response 100-continue. So the two steps would be kept. What do I miss there? Thanks for enlightenment Ulul https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
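
Done by hand with curl, the CREATE exchange described in the linked doc looks like this (hosts are hypothetical; curl adds Expect: 100-continue on its own when uploading with -T):

    curl -i -X PUT 'http://namenode-host:50070/webhdfs/v1/tmp/f.txt?op=CREATE'
    # -> 307 TEMPORARY_REDIRECT with a Location: header pointing at a datanode
    curl -i -X PUT -T f.txt '<Location URL from the previous response>'
    # -> 100 Continue from the DN, then 201 Created once the data has been streamed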

Re: HDFS openforwrite CORRUPT -> HEALTHY

2014-10-11 Thread Ulul
Hi Vinayak, Sorry this is beyond my understanding. I would need to test further to try and understand the problem. Hope you'll find help from someone else Ulul On 08/10/2014 07:18, Vinayak Borkar wrote: Hi Ulul, I think I can explain why the sizes differ and the block names vary.

Re: TestDFSIO and hadoop config options

2014-10-07 Thread Ulul
Hi I would also go with passing the options to TestDFSIO. Once your write test is over you can check how many replicas were created for each file with hdfs fsck -files -blocks Ulul On 07/10/2014 09:27, Bart Vandewoestyne wrote: Hello list, I would like to experiment with TestDFSIO and
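
For example (the jar name and location vary by release, the -D properties are only honoured if your TestDFSIO goes through ToolRunner, and /benchmarks/TestDFSIO is the default output dir):

    hadoop jar hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO \
      -D dfs.replication=2 -write -nrFiles 10 -fileSize 1000
    hdfs fsck /benchmarks/TestDFSIO -files -blocks   # how many replicas did each file get?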

Re: HDFS openforwrite CORRUPT -> HEALTHY

2014-10-07 Thread Ulul
Hi Vinayak I find it strange that the file should have a different size and the block a different name. Are you sure your writing client wasn't interfering? Ulul On 07/10/2014 19:41, Vinayak Borkar wrote: Trying again since I did not get a reply. Please let me know if I should

Re: Hadoop and RAID 5

2014-10-07 Thread Ulul
Yes, I also read in the P420 user guide that it was RAID only. We'll live with it I guess... Thanks for the HP/Cloudera link, it's valuable reading! On 07/10/2014 08:52, Travis wrote: On Sun, Oct 5, 2014 at 4:17 PM, Ulul <had...@ulul.org> wrote: Hi Travis

Re: [Blog] Doubts On CCD-410 Sample Dumps on Ecosystem Projects

2014-10-06 Thread Ulul
s and mailing lists :-) Ulul On 06/10/2014 13:54, unmesha sreeveni wrote: what about the last one? The answer is correct. Pig. Isn't it? On Mon, Oct 6, 2014 at 4:29 PM, adarsh deshratnam <adarsh.deshrat...@gmail.com> wrote: For question 3 the answer should be B and for questi

Re: Reduce fails always

2014-10-06 Thread Ulul
Hello Did you check you don't have a job.setNumReduceTasks(1); in your job driver? And you should check the number of slots available on the jobtracker web interface Ulul On 06/10/2014 20:34, Abdul Navaz wrote: Hello, I have 8 Datanodes and each has a storage capacity of only 3G

Re: Reduce phase of wordcount

2014-10-06 Thread Ulul
t 7... Combiners are very effective at limiting shuffle overhead and for a job like wordcount you can just use the reduce class. Just add something like job.setCombinerClass(MyReducer.class); to your driver and you're good Ulul On 06/10/2014 21:18, Renato Moutinho wrote: Hi folks, j

Re: Reduce phase of wordcount

2014-10-05 Thread Ulul
Hi You indicate that you have just one reducer, which is the default in Hadoop 1 but quite insufficient for a 7-slave-node cluster. You should increase mapred.reduce.tasks, use combiners, and maybe tune mapred.tasktracker.reduce.tasks.maximum Hope that helps Ulul On 05/10/2014 16:53
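
With the bundled wordcount for instance (Hadoop 1 property names; the example driver parses -D through GenericOptionsParser, and the jar name varies by release):

    hadoop jar hadoop-examples.jar wordcount -D mapred.reduce.tasks=7 input output
    # mapred.tasktracker.reduce.tasks.maximum is a per-TaskTracker setting in mapred-site.xml,
    # not something to pass per job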

Re: Hadoop and RAID 5

2014-10-05 Thread Ulul
hat's necessary for it not to disrupt operations Thanks again Ulul On 02/10/2014 00:25, Travis wrote: On Wed, Oct 1, 2014 at 4:01 PM, Ulul <had...@ulul.org> wrote: Dear hadoopers, Has anyone been confronted with deploying a cluster in a traditional IT shop

Hadoop and RAID 5

2014-10-01 Thread Ulul
share your experiences around RAID for redundancy (1, 5 or other) in Hadoop conf. Thank you Ulul

Re: Doubt Regarding QJM protocol - example 2.10.6 of Quorum-Journal Design document

2014-09-28 Thread Ulul
Hi A developer should answer that but a quick look at an edits file with od suggests that records are not fixed length. So maybe the likelihood of the situation you suggest is so low that there is no need to check more than the file size Ulul On 28/09/2014 11:17, Giridhar Addepalli wrote: Hi
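
The kind of quick look meant above, on a hypothetical edits segment path:

    od -c /hadoop/dfs/name/current/edits_0000000000000000001-0000000000000000042 | head -40
    # shows records of varying length rather than a fixed-size layout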

Re: Map job not finishing

2014-09-07 Thread Ulul
Oops, you're using HDP 2.1 which means Hadoop 2.4, so the property name is mapreduce.tasktracker.map.tasks.maximum and more importantly it should be irrelevant under Yarn, for which map slots don't matter. Explanation anyone? Ulul On 07/09/2014 22:35, Ulul wrote: Hi Adding an anot

Re: Map job not finishing

2014-09-07 Thread Ulul
Hi Adding another TT may not be the only way, increasing mapred.tasktracker.map.tasks.maximum could also do the trick. Explanation here: http://www.thecloudavenue.com/2014/01/oozie-hangs-on-single-node-for-work-flow-with-fork.html Cheers Ulul On 07/09/2014 01:01, Rich Haase wrote