Re: aggregation by time window

2013-01-28 Thread Oleg Ruchovets
: since each of your events will go into several buckets, you could use map() to emit each item multiple times, once for each bucket. On 28.01.2013 at 13:56, Oleg Ruchovets oruchov...@gmail.com wrote: Hi, I have the following row data structure: event_id | time == event1 | 10
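
For illustration only, a minimal sketch of the emit-per-bucket idea in a Hadoop mapper. The 15-minute window length, 5-minute slide, the "eventN | HH:MM" input format and the HH:MM bucket key are assumptions for the example, not details from the thread:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch: emit each event once per overlapping time window ("bucket").
// Window length and slide are illustrative assumptions.
public class WindowBucketMapper extends Mapper<LongWritable, Text, Text, Text> {

  private static final int WINDOW_MINUTES = 15;
  private static final int SLIDE_MINUTES = 5;

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    // Expect lines like "event1 | 10:07".
    String[] parts = line.toString().split("\\|");
    if (parts.length != 2) {
      return; // skip malformed lines
    }
    String eventId = parts[0].trim();
    String[] hm = parts[1].trim().split(":");
    int minuteOfDay = Integer.parseInt(hm[0]) * 60 + Integer.parseInt(hm[1]);

    // An event at minute t belongs to every window whose start s satisfies
    // s <= t < s + WINDOW_MINUTES, where s is a multiple of SLIDE_MINUTES.
    int lastStart = (minuteOfDay / SLIDE_MINUTES) * SLIDE_MINUTES;
    for (int start = lastStart;
         start > minuteOfDay - WINDOW_MINUTES && start >= 0;
         start -= SLIDE_MINUTES) {
      String bucket = String.format("%02d:%02d", start / 60, start % 60);
      context.write(new Text(bucket), new Text(eventId)); // one emit per bucket
    }
  }
}

A reducer keyed by the bucket start time then sees every event that falls into that window and can aggregate them; under these assumptions an event at 10:07 is emitted for the 09:55, 10:00 and 10:05 buckets.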

Re: aggregation by time window

2013-01-28 Thread Oleg Ruchovets
) . On 28 January 2013 12:56, Oleg Ruchovets oruchov...@gmail.com wrote: Hi, I have the following row data structure:

event_id | time
==
event1   | 10:07
event2   | 10:10
event3   | 10:12
event4   | 10:20
event5   | 10:23
event6   | 10

Re: aggregation by time window

2013-01-28 Thread Oleg Ruchovets
in the original input. Kai On 28.01.2013 at 14:43, Oleg Ruchovets oruchov...@gmail.com wrote: Hi Kai. It is very interesting. Can you please explain your idea in more detail? What will be the key in the map phase? Suppose we have an event at 10:07. How would you emit

Re: Disks RAID best practice

2012-11-01 Thread Oleg Ruchovets
On Nov 1, 2012, at 7:37 AM, Oleg Ruchovets oruchov...@gmail.com wrote: Hi, What is the best practice for disk RAID (master and data nodes)? Thanks in advance, Oleg.

Hadoop and Cuda , JCuda (CPU+GPU architecture)

2012-09-24 Thread Oleg Ruchovets
Hi, I am going to process video analytics using hadoop. I am very interested in a CPU+GPU architecture, especially using CUDA (http://www.nvidia.com/object/cuda_home_new.html) and JCUDA (http://jcuda.org/). Does using HADOOP and a CPU+GPU architecture bring significant performance improvement and
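
As a rough sketch of how the two usually fit together (an illustration, not something from the thread): the GPU work lives inside the map task, which initializes CUDA once per task via JCuda and offloads per-record or batched computation to a kernel. Class names and the device index are placeholders.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import jcuda.driver.CUcontext;
import jcuda.driver.CUdevice;
import jcuda.driver.JCudaDriver;

// Sketch: one-time GPU initialization per map task, GPU offload per record.
public class GpuMapper extends Mapper<LongWritable, Text, Text, Text> {

  private CUcontext gpuContext;

  @Override
  protected void setup(Context context) {
    // Initialize the CUDA driver once per task JVM.
    JCudaDriver.setExceptionsEnabled(true);
    JCudaDriver.cuInit(0);
    CUdevice device = new CUdevice();
    JCudaDriver.cuDeviceGet(device, 0); // first GPU on the node (placeholder choice)
    gpuContext = new CUcontext();
    JCudaDriver.cuCtxCreate(gpuContext, 0, device);
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Decode the record, copy it to device memory, launch a kernel
    // (cuModuleLoad / cuLaunchKernel with a .ptx file shipped alongside the job),
    // copy the result back, and emit it. Omitted here.
    context.write(new Text("result"), value);
  }
}

Whether this brings a significant speedup depends on how much arithmetic each record needs relative to the I/O and host-to-GPU copy overhead, so it is worth measuring on a single node before scaling out.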

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

2012-09-24 Thread Oleg Ruchovets
Thank you very much. I saw this link! Do you have any code or example shared on the network (GitHub, for example)? On Mon, Sep 24, 2012 at 5:33 PM, Chen He airb...@gmail.com wrote: http://wiki.apache.org/hadoop/CUDA%20On%20Hadoop On Mon, Sep 24, 2012 at 10:30 AM, Oleg Ruchovets oruchov

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

2012-09-24 Thread Oleg Ruchovets
. On Mon, Sep 24, 2012 at 10:45 AM, Oleg Ruchovets oruchov...@gmail.com wrote: Thank you very much. I saw this link! Do you have any code or example shared on the network (GitHub, for example)? On Mon, Sep 24, 2012 at 5:33 PM, Chen He airb...@gmail.com wrote: http://wiki.apache.org

Unexpected end of input stream (GZ)

2012-07-24 Thread Oleg Ruchovets
Hi, I got the following exception running a hadoop job:

java.io.EOFException: Unexpected end of input stream
  at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:99)
  at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:87)
  at
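
This error typically surfaces when a .gz input file is truncated or only partially written; that is an assumption about the cause, not a diagnosis from the thread, but it is easy to verify outside the job with a small standalone check like the following (plain JDK, no Hadoop needed):

import java.io.FileInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

// Reads a .gz file to the end; a truncated archive fails with an EOFException,
// the same kind of failure DecompressorStream hits inside the job.
public class GzipCheck {
  public static void main(String[] args) throws IOException {
    byte[] buffer = new byte[64 * 1024];
    GZIPInputStream in = new GZIPInputStream(new FileInputStream(args[0]));
    try {
      long total = 0;
      int n;
      while ((n = in.read(buffer)) != -1) {
        total += n;
      }
      System.out.println(args[0] + " decompressed OK, " + total + " bytes");
    } finally {
      in.close();
    }
  }
}

Running it over a local copy of each suspect input file shows which one, if any, does not decompress cleanly.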

execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
Hi, what is the way to execute a hadoop job on a remote cluster? I want to execute my hadoop job from a remote web application, but I didn't find any hadoop client (remote API) to do it. Please advise. Oleg

Re: execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
Excellent. Can you give a small code example? On Tue, Oct 18, 2011 at 1:13 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: - Original Message - From: Oleg Ruchovets oruchov...@gmail.com Date: Tuesday, October 18, 2011 4:11 pm Subject: execute hadoop job from remote

Re: execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
:00 PM, Oleg Ruchovets oruchov...@gmail.com wrote: Excellent. Can you give a small code example? Good sample by Bejoy; hope you have access to this site. Also please go through these docs: http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Example%3A+WordCount+v2.0 Here

Re: execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
submission. You just need to remotely invoke the shell script that contains the hadoop jar command with any required input arguments. Sorry if I'm not getting your requirement exactly. Regards, Bejoy.K.S On Tue, Oct 18, 2011 at 6:29 PM, Oleg Ruchovets oruchov...@gmail.com wrote: Thank you

Re: execute hadoop job from remote web application

2011-10-18 Thread Oleg Ruchovets
to get a string.). It would go through fine and submit the job jar, with classes included, over to the JobTracker. On Tue, Oct 18, 2011 at 9:13 PM, Oleg Ruchovets oruchov...@gmail.com wrote: I'll try to be more specific. It is not a dependent jar. It is a jar which contains map/reduce/combine
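
Pulling the thread together, a minimal sketch of submitting from plain Java code (for example inside a web application) to a remote 0.20-era cluster; the host names, ports, paths and the jar name are placeholders:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Sketch: submit a job to a remote cluster from any JVM.
public class RemoteSubmit {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf();
    // Point the client at the remote NameNode and JobTracker (placeholders).
    conf.set("fs.default.name", "hdfs://namenode.example.com:9000");
    conf.set("mapred.job.tracker", "jobtracker.example.com:9001");

    // The jar containing the map/reduce/combine classes; it is shipped
    // to the JobTracker as part of the submission.
    conf.setJar("/path/to/my-mapreduce-job.jar");
    // conf.setMapperClass(...); conf.setReducerClass(...); as in a local run.

    conf.setJobName("remote-submit-example");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    FileInputFormat.setInputPaths(conf, new Path("/input"));
    FileOutputFormat.setOutputPath(conf, new Path("/output"));

    // Blocks until the job finishes; create a JobClient and call
    // submitJob(conf) instead if the web application should not wait.
    JobClient.runJob(conf);
  }
}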

get result parameters of finished hadoop job

2011-08-09 Thread Oleg Ruchovets
Hi, I want to get result information about a finished hadoop job. Currently I can see it in the hadoop admin web console, for example the job's task table columns: Kind | % Complete | Num Tasks | Pending | Running | Complete | Killed | Failed/Killed Task Attempts http://hadoop-master.infolinks.local:8021/jobfailures.jsp?jobid=job_201108071903_0036
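
A sketch of fetching the same information programmatically with the old mapred API (an illustration only; the JobTracker address is a placeholder, and the job id is the one from the console URL above):

import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

// Sketch: fetch the status and counters of a finished job by its id,
// i.e. the same numbers the admin web console shows.
public class FinishedJobInfo {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf();
    conf.set("mapred.job.tracker", "jobtracker.example.com:9001"); // placeholder

    JobClient client = new JobClient(conf);
    RunningJob job = client.getJob(JobID.forName("job_201108071903_0036"));

    System.out.println("complete:   " + job.isComplete());
    System.out.println("successful: " + job.isSuccessful());
    System.out.println("map %:      " + job.mapProgress());
    System.out.println("reduce %:   " + job.reduceProgress());

    // Built-in and custom counters (records read/written, and so on).
    Counters counters = job.getCounters();
    System.out.println(counters.toString());
  }
}

Note that the JobTracker keeps only a limited history of completed jobs in memory, so jobs that have already been retired may not be retrievable this way.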

Re: data and metadata in a different folders

2011-03-14 Thread Oleg Ruchovets
Subject: Re: data and metadata in a different folders To: common-user@hadoop.apache.org What type of metadata? Stuff like what's held by sequence files? Is it to be held for each file or per directory/set-of-files? On Mon, Mar 14, 2011 at 1:56 PM, Oleg Ruchovets oruchov...@gmail.com

hadoop infrastructure questions (production environment)

2011-02-08 Thread Oleg Ruchovets
Hi, we are going to production and have some questions: We are using the 0.20-append version (as I understand it is an HBase 0.90 requirement). 1) Currently we have to process 50GB of text files per day; it can grow to 150GB -- what is the best hadoop file size for our load and

java.io.IOException: Bad connect ack with firstBadLink

2010-11-09 Thread Oleg Ruchovets
Hi, running hadoop map/reduce I got the exception below. 1) Why does it happen? 2) The job didn't fail and continued its execution; does this exception cause data loss, or does map/reduce use a recovery mechanism? 2010-11-09 05:10:08,735 INFO org.apache.hadoop.hdfs.DFSClient: Exception in

Re: java.io.IOException: Bad connect ack with firstBadLink

2010-11-09 Thread Oleg Ruchovets
On Tue, Nov 9, 2010 at 12:58 PM, Oleg Ruchovets oruchov...@gmail.com wrote: Hi, running hadoop map/reduce I got the exception below. 1) Why does it happen? 2) The job didn't fail and continued its execution; does this exception cause data loss, or does map/reduce use a recovery mechanism? 2010-11

what does it mean -- java.io.IOException: Filesystem closed

2010-11-02 Thread Oleg Ruchovets
Hi, Running a hadoop job, from time to time I get the following exception (from one of the reducers): The questions are: 1) What does this exception mean for data integrity? 2) Does it mean that the part of the data the reducer was responsible for (and got the exception on) is lost? 3) What could cause such

write files to hdfs from osgi

2010-07-06 Thread Oleg Ruchovets
Hi, I need to write files to the hdfs file system using the hdfs java api. In my case the hdfs client needs to run from OSGi. 1) Did someone succeed in using the hdfs client from an OSGi container? What are the best practices and potential problems with this? 2) As I understand, at least
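
Setting the OSGi packaging question aside, the HDFS write itself is just the FileSystem API; a minimal sketch (the namenode URI and paths are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: write a file to HDFS through the Java API.
public class HdfsWriteExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode.example.com:9000"); // placeholder

    FileSystem fs = FileSystem.get(conf);
    FSDataOutputStream out = fs.create(new Path("/user/oleg/example.txt"), true);
    try {
      out.write("hello from the hdfs java api\n".getBytes("UTF-8"));
    } finally {
      out.close();
      fs.close();
    }
  }
}

Under OSGi the API calls are the same; the work is mostly in making the Hadoop client jars and their dependencies visible to the bundle's class loader.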

copy files to HDFS protocols

2010-06-30 Thread Oleg Ruchovets
Hi, I use the HDFS shell to copy files from the local FileSystem to Hadoop HDFS (the copyFromLocal command). 1) How can I provide a path to a file that is on the local FS of a different machine than the one hadoop is on? 2) What other protocols (ways) can I use to write files to HDFS? Is it possible to
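
On question 2), one common way besides the shell is the FileSystem Java API, which can run on the machine where the local file actually lives; a minimal sketch (the namenode URI and paths are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: the programmatic equivalent of "hadoop fs -copyFromLocal",
// runnable from any machine that can reach the namenode.
public class CopyToHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode.example.com:9000"); // placeholder

    FileSystem fs = FileSystem.get(conf);
    // Local source path on the machine running this code, HDFS destination.
    fs.copyFromLocalFile(new Path("/tmp/data.log"), new Path("/user/oleg/data.log"));
    fs.close();
  }
}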

Re: the same key in different reducers

2010-06-10 Thread Oleg Ruchovets
description below, it is possible. On Wed, Jun 9, 2010 at 1:17 AM, Oleg Ruchovets oruchov...@gmail.com wrote: Hi, My hadoop job writes the results of map/reduce to HBase. I have 3 reducers. Here is the sequence of input and output parameters for the Mapper, Combiner and Reducer: input

the same key in different reducers

2010-06-09 Thread Oleg Ruchovets
Hi, My hadoop job writes the results of map/reduce to HBase. I have 3 reducers. Here is the sequence of input and output parameters for the Mapper, Combiner and Reducer:

input: InputFormat<K1,V1>
mapper: Mapper<K1,V1,K2,V2>
combiner: Reducer<K2,V2,K2,V2>
reducer: Reducer<K2,V2,K3,V3>
output:
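
For context on where each key lands (an illustration, not something from the thread): the default HashPartitioner computes the reducer index from the map output key alone, so within one job every record with the same K2 is routed to the same reducer. The K3 keys that the reducers themselves emit are not partitioned again, so two different reducers can still write the same K3 to HBase.

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.HashPartitioner;

// Sketch: the partition (reducer index) depends only on the key,
// so the same K2 always goes to the same one of the 3 reducers.
public class PartitionDemo {
  public static void main(String[] args) {
    HashPartitioner<Text, Text> partitioner = new HashPartitioner<Text, Text>();
    int numReducers = 3;
    Text key = new Text("some-key");
    System.out.println(partitioner.getPartition(key, new Text("v1"), numReducers));
    System.out.println(partitioner.getPartition(key, new Text("v2"), numReducers)); // same index
  }
}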

best practice to work with resources

2010-05-25 Thread Oleg Ruchovets
Hi all. I have some data stored in properties file(s) and need the hadoop job to use it. What is the best practice for working with resources like properties files in a hadoop job? Does hadoop provide infrastructure for working with properties files, or is there another way to do it (like XML config files)?
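
One simple approach (a sketch under assumed names; the file name and property keys are placeholders) is to load the properties file in the driver, copy its entries into the job Configuration, and read them back in each task:

import java.io.InputStream;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PropertyAwareMapper extends Mapper<LongWritable, Text, Text, Text> {

  private String threshold;

  // Called in the driver, before submitting the job.
  public static void loadIntoConf(Configuration conf) throws Exception {
    Properties props = new Properties();
    InputStream in =
        PropertyAwareMapper.class.getResourceAsStream("/myapp.properties"); // placeholder name
    props.load(in);
    in.close();
    for (String name : props.stringPropertyNames()) {
      conf.set(name, props.getProperty(name)); // shipped to every task
    }
  }

  @Override
  protected void setup(Context context) {
    // Values set on the Configuration in the driver are visible here.
    threshold = context.getConfiguration().get("myapp.threshold", "10");
  }

  @Override
  protected void map(LongWritable key, Text value, Context context) {
    // ... use 'threshold' while processing each record ...
  }
}

For larger resource files, shipping the file itself with the job (for example via the distributed cache) and reading it in setup() works as well.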

execute mapreduce job on multiple hdfs files

2010-03-23 Thread Oleg Ruchovets
Hi, All the examples I found execute a mapreduce job on a single file, but in my situation I have more than one. Suppose I have a folder on HDFS which contains some files: /my_hadoop_hdfs/my_folder: /my_hadoop_hdfs/my_folder/file1.txt
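
A job can take a directory, several explicit paths, or a glob as its input, so the whole folder can be processed in one run; a minimal sketch with the old mapred API:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

// Sketch: every file under the folder becomes input splits for the same job.
public class MultiFileInput {
  public static void configureInput(JobConf conf) {
    // Whole directory:
    FileInputFormat.setInputPaths(conf, new Path("/my_hadoop_hdfs/my_folder"));

    // Or list the files explicitly, or use a glob:
    // FileInputFormat.setInputPaths(conf,
    //     new Path("/my_hadoop_hdfs/my_folder/file1.txt"),
    //     new Path("/my_hadoop_hdfs/my_folder/file2.txt"));
    // FileInputFormat.setInputPaths(conf, new Path("/my_hadoop_hdfs/my_folder/*.txt"));
  }
}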