Re: JetS3T settings spark

2014-12-30 Thread durga katakam
Thanks Matei.
-D

On Tue, Dec 30, 2014 at 4:49 PM, Matei Zaharia wrote:

> This file needs to be on your CLASSPATH actually, not just in a directory.
> The best way to pass it in is probably to package it into your application
> JAR. You can put it in src/main/resources in a Maven or SBT project, and
> check that it makes it into the JAR using jar tf yourfile.jar.
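>
> For instance, here is a quick runtime check (just a sketch) that the file
> actually made it onto the classpath, since JetS3T reads jets3t.properties
> from the classpath root:
>
> val url = getClass.getResource("/jets3t.properties")
> require(url != null, "jets3t.properties is not on the classpath")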
>
> Matei
>
> > On Dec 30, 2014, at 4:21 PM, durga  wrote:
> >
> > I am not sure how to pass the jets3t.properties file to spark-submit.
> > The --files option doesn't seem to work.
> > Can someone please help me? My production Spark jobs sporadically hang
> > when reading S3 files.
> >
> > Thanks,
> > -D


Re: S3 files, Spark job hangs up

2014-12-22 Thread durga katakam
Yes, I am reading thousands of files every hour. Is there any way I can
tell Spark to time out?
Thanks for your help.

-D

On Mon, Dec 22, 2014 at 4:57 AM, Shuai Zheng  wrote:

> Is it possible that too many connections are open to S3 from one node? I
> had this issue before when I opened a few hundred files on S3 from one
> node; it just blocked without any error until it timed out later.
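>
> If that's the case, tuning the JetS3T HTTP client in jets3t.properties may
> help. A sketch, with illustrative values rather than recommendations:
>
> # fail fast instead of blocking indefinitely
> httpclient.connection-timeout-ms=10000
> httpclient.socket-timeout-ms=60000
> # cap concurrent connections and retry transient failures
> httpclient.max-connections=100
> httpclient.retry-max=5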
>
> On Monday, December 22, 2014, durga  wrote:
>
>> Hi All,
>>
>> I am facing a strange issue sporadically: occasionally my Spark job hangs
>> while reading S3 files. It does not throw an exception or make any
>> progress; it just hangs there.
>>
>> Is this a known issue? Please let me know how I can solve it.
>>
>> Thanks,
>> -D


Re: S3 globbing

2014-12-17 Thread durga katakam
Hi Akhil,

Thanks for your time, I appreciate it. I tried this approach, but I get
either fewer or more files than the exact hour's files.

Is there any way I can specify a range (between one time and another)?

Thanks,
D

On Tue, Dec 16, 2014 at 11:04 PM, Akhil Das wrote:
>
> Did you try something like:
>
> // Get a timestamp from one hour ago
> val d = System.currentTimeMillis() - 3600 * 1000
> // Keep the first 7 of the 13 digits; the trailing wildcard then covers a
> // ~16.7-minute window (10^6 ms), not a full hour
> val ex = "abc_" + d.toString.substring(0, 7) + "*.json"
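>
> To cover an explicit range, one option is to enumerate every 7-digit prefix
> between the two timestamps and pass the globs to sc.textFile as a single
> comma-separated path. A sketch, with the bucket and naming scheme as
> placeholders:
>
> val start = System.currentTimeMillis() - 3600 * 1000L
> val end = System.currentTimeMillis()
> // each prefix spans 10^6 ms, so an hour yields 4-5 globs; the boundary
> // globs can match a few files just outside the range
> val paths = (start / 1000000L to end / 1000000L)
>   .map(p => "s3n://your-bucket/abc_" + p + "*.json")
> val rdd = sc.textFile(paths.mkString(","))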
>
> Thanks
> Best Regards
>
> On Wed, Dec 17, 2014 at 5:05 AM, durga  wrote:
>>
>> Hi All,
>>
>> I need help with glob patterns in my sc.textFile() call.
>>
>> I have lots of files named with an epoch-millisecond timestamp,
>> e.g. abc_1418759383723.json.
>>
>> Now I need to consume the last hour's files using that epoch timestamp.
>>
>> I tried a couple of options, but nothing seems to work for me.
>>
>> If any of you have faced this issue and found a solution, please help me.
>>
>> I appreciate your help.
>>
>> Thanks,
>> D