Re: Task re-scheduling in hadoop
Moving to mapreduce-user@, bcc common-user@

On Aug 23, 2011, at 2:31 AM, Vaibhav Pol wrote:

> Hi All,
> I have a query regarding task re-scheduling. Is it possible to make the JobTracker wait for some time before re-scheduling a failed tracker's tasks?

Why would you want to do that? Typically, you want the JT to retry the failed tasks as quickly as possible, so that a failing job fails fast rather than running all of its tasks before failing.

Arun

> Thanks and regards,
> Vaibhav Pol
> National PARAM Supercomputing Facility
> Centre for Development of Advanced Computing
> Ganeshkhind Road, Pune University Campus
> Pune, Maharashtra
> Phone: +91-20-25704176 ext. 176
> Cell: +91 9850466409
Hadoop integration with SAS
Has anyone worked on integrating Hadoop data with SAS? Does SAS have a connector to HDFS? Can it use data directly on HDFS? Any links, samples, or tools?

Thanks!
Jonathan

This message is for the designated recipient only and may contain privileged, proprietary, or otherwise private information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the email by you is prohibited.
Re: Hadoop integration with SAS
Forget SAS, use Pig instead.

On Aug 23, 2011 11:22 PM, jonathan.hw...@accenture.com wrote:
> Anyone had worked on Hadoop data integration with SAS? Does SAS have a connector to HDFS? Can it use data directly on HDFS? Any link or samples or tools?
Re: Hadoop integration with SAS
R has a connector for Hadoop, if that helps.

From: jonathan.hw...@accenture.com
To: common-user@hadoop.apache.org
Sent: Tuesday, 23 August 2011 2:21 PM
Subject: Hadoop integration with SAS

> Anyone had worked on Hadoop data integration with SAS? Does SAS have a connector to HDFS? Can it use data directly on HDFS? Any link or samples or tools?
Hadoop in Action Partitioner Example
For those of you who have the book: on page 49 there is a custom partitioner example. It describes a situation where the map emits (K, V) pairs whose key is a compound key (K1, K2), and we want to reduce over the K1s rather than the whole keys. This is used as an example of a situation where a custom partitioner should be written to hash on K1, so that the right keys are sent to the same reducers.

But as far as I know, although this would partition the keys correctly (send them to the correct reducers), the reduce function would still be called with (grouped under) the original keys K, not yielding the desired result. The only way I know of doing this is to create a new WritableComparable that carries all of K but uses only K1 in its hash/equals/compare methods, in which case you would not need to write your own partitioner anyway.

Am I misinterpreting what the author meant, or is there something going on that I don't know about? It would have been sweet if I could accomplish all that with just the partitioner. Either I am misunderstanding something fundamental, or I am misunderstanding the example's intention, or there is something wrong with it.

Thanks,
Mehmet
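The partitioner the example calls for is just "hash on K1 only, ignore K2". Here is a minimal, Hadoop-free sketch of that logic (class and field names are illustrative, not from the book; in a real job this would implement WritableComparable and extend org.apache.hadoop.mapreduce.Partitioner):

```java
// Plain-Java sketch of partitioning a compound key (K1, K2) on K1 alone,
// so all keys sharing a K1 land on the same reducer.
public class PartitionSketch {
    static final class CompoundKey {
        final String k1; // component we want to group on
        final String k2; // secondary component, ignored by the partitioner
        CompoundKey(String k1, String k2) { this.k1 = k1; this.k2 = k2; }
    }

    // Mirrors Partitioner.getPartition(key, value, numPartitions):
    // hash only K1, masked non-negative, modulo the reducer count.
    static int getPartition(CompoundKey key, int numPartitions) {
        return (key.k1.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        CompoundKey a = new CompoundKey("user42", "2011-08-01");
        CompoundKey b = new CompoundKey("user42", "2011-08-23");
        // Same K1, different K2 => same partition.
        System.out.println(getPartition(a, 10) == getPartition(b, 10)); // true
    }
}
```

As the question notes, this alone only controls which reducer sees the keys; it does not change how reduce() groups them.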
Re: Hadoop in Action Partitioner Example
Job.setGroupingComparatorClass will let you define a RawComparator class that compares only the K1 component of K. The reduce-side sort will still order all keys using K's compareTo method, but it will use the grouping comparator when deciding which values to pass to a single reduce() call.

On Tue, Aug 23, 2011 at 7:25 PM, Mehmet Tepedelenlioglu mehmets...@gmail.com wrote:
> For those of you who has the book, on page 49 there is a custom partitioner example. It basically describes a situation where the map emits K,V, but the key is a compound key like (K1,K2), and we want to reduce over K1s and not the whole of the Ks. [...]
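The interplay described above (full sort by (K1, K2), but reduce groups cut on K1 only) can be simulated without Hadoop. This sketch is illustrative; the class and key names are made up, and in a real job the two comparators would be registered via Job.setSortComparatorClass and Job.setGroupingComparatorClass:

```java
import java.util.*;

// Hadoop-free simulation of the reduce-side merge: keys are fully sorted
// by (K1, K2), then the "grouping comparator" (K1 only) decides where one
// reduce() call ends and the next begins.
public class GroupingSketch {
    record Key(String k1, int k2) {}

    static Map<String, List<Integer>> reduceGroups(List<Key> keys) {
        // Sort comparator: the whole compound key, as the sort phase would.
        keys.sort(Comparator.comparing(Key::k1).thenComparing(Key::k2));
        // Grouping comparator: only K1 is consulted when forming groups,
        // so each K1 yields one "reduce call" seeing its K2s in sorted order.
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (Key k : keys) {
            groups.computeIfAbsent(k.k1(), x -> new ArrayList<>()).add(k.k2());
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Key> keys = new ArrayList<>(List.of(
            new Key("b", 2), new Key("a", 3), new Key("b", 1), new Key("a", 1)));
        System.out.println(reduceGroups(keys)); // {a=[1, 3], b=[1, 2]}
    }
}
```

This is the standard secondary-sort pattern: the partitioner keeps all (K1, *) keys on one reducer, the sort comparator orders them, and the grouping comparator collapses them into a single reduce group.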
Re: Hadoop in Action Partitioner Example
Thanks, that is very useful to know.

On Aug 23, 2011, at 4:40 PM, Chris White wrote:
> Job.setGroupingComparatorClass will allow you to define a RawComparator class, in which you can only compare the K1 component of K. [...]