That doesn't seem to be the issue, because I just manually created a folder called "_logs" in S3 and it worked. Any ideas why the sqoop import itself would work, but would then fail when trying to create a "_logs" folder after it's done?
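For reference, the import I'm running is roughly along these lines (the DB host, database, user, and table name below are placeholders, as are the keys, and I'm including --hive-import since this is the hive import discussed below):

    # sketch only: DB host, database, user, table, and S3 keys are placeholders
    sqoop import \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser \
      -P \
      --table mytable \
      --hive-import \
      --target-dir s3n://MYS3APIKEY:MYS3SECRETKEY@iakbar.emr/dump2/
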
On Tue, Feb 4, 2014 at 1:44 PM, Imran Akbar <[email protected]> wrote:

> Hey Venkat,
> Sorry, I meant to say I made that change in core-site.xml, not site-core.xml.
>
> I'm trying to do a hive import from MySQL to S3, but I think the error is popping up because sqoop is trying to create a "_logs" directory, but according to S3's naming conventions you can't start the name of a bucket with an underscore:
>
> "Bucket names can contain lowercase letters, numbers, and dashes. Each label must start and end with a lowercase letter or a number."
> http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html
>
> this is the error i'm getting (the iakbar.emr/dump2/ location on S3 contains files, so I know sqoop works up to this point):
> "This file system object (hdfs://10.202.163.18:9000) does not support access to the request path 's3n://****:****@iakbar.emr/dump2/_logs'"
>
> thanks,
> imran
>
>
> On Tue, Feb 4, 2014 at 12:45 PM, Venkat Ranganathan <[email protected]> wrote:
>
>> I think you are trying to do a hive import from the S3 location. I think it may not be supported - As Jarcec said you may want to change the core-site to point to S3 on your Hadoop cluster. But I have not tested this so not sure if that will work
>>
>> Venkat
>>
>>
>> On Tue, Feb 4, 2014 at 12:04 PM, Imran Akbar <[email protected]> wrote:
>>
>>> I think it may have worked, but I am getting an error.
>>>
>>> I added this line to site-core.xml:
>>> <property><name>fs.defaultFS</name><value>s3n</value></property>
>>>
>>> and I see the following contents in my S3 directory after running sqoop:
>>> _SUCCESS
>>> part-m-00000
>>> part-m-00001
>>> part-m-00002
>>> part-m-00003
>>> part-m-00004
>>> part-m-00005
>>>
>>> I'm running sqoop version 1.4.4.
>>>
>>> But I still get this error after running sqoop:
>>> http://pastebin.com/5AYCsd78
>>>
>>> any ideas?
>>> thanks for the help so far
>>>
>>> imran
>>>
>>>
>>> On Tue, Feb 4, 2014 at 11:24 AM, Venkat Ranganathan <[email protected]> wrote:
>>>
>>>> Which version of sqoop are you using. Sqoop 1.4.4 addressed use of other filesystems with the fix mentioned in SQOOP-1033
>>>>
>>>> Thanks
>>>> Venkat
>>>>
>>>>
>>>> On Tue, Feb 4, 2014 at 8:14 AM, Jarek Jarcec Cecho <[email protected]> wrote:
>>>>
>>>>> Yes Imran,
>>>>> I would try to define the fs.defaultFS for the S3 in core-site.xml and see if it will help Sqoop to accept the S3 path.
>>>>>
>>>>> Jarcec
>>>>>
>>>>> On Tue, Feb 04, 2014 at 08:08:17AM -0800, Imran Akbar wrote:
>>>>> > thanks Jarek,
>>>>> > How would I do that? Do I need to set fs.defaultFS in core-site.xml, or is it something else? Is there a document somewhere which describes this?
>>>>> >
>>>>> > yours,
>>>>> > imran
>>>>> >
>>>>> >
>>>>> > On Mon, Feb 3, 2014 at 9:31 PM, Jarek Jarcec Cecho <[email protected]> wrote:
>>>>> >
>>>>> > > Would you mind trying to set the S3 filesystem as the default one for Sqoop?
>>>>> > >
>>>>> > > Jarcec
>>>>> > >
>>>>> > > On Mon, Feb 03, 2014 at 10:25:50AM -0800, Imran Akbar wrote:
>>>>> > > > Hi,
>>>>> > > > I've been able to sqoop from MySQL into HDFS, but I was wondering if it was possible to send the data directly to S3 instead. I've read some posts on this forum and others that indicate that it's not possible to do this - could someone confirm?
>>>>> > > >
>>>>> > > > I tried to get it to work by setting:
>>>>> > > > --warehouse-dir s3n://MYS3APIKEY:MYS3SECRETKEY@bucketname/folder/
>>>>> > > > or
>>>>> > > > --target-dir s3n://MYS3APIKEY:MYS3SECRETKEY@bucketname/folder/
>>>>> > > >
>>>>> > > > options but I get the error:
>>>>> > > > ERROR tool.ImportTool: Imported Failed: This file system object (hdfs://10.168.22.133:9000) does not support access to the request path 's3n://****:****@iakbar.emr/new-hive-output/_logs' You possibly called FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to obtain a file system supporting your path
>>>>> > > >
>>>>> > > > If it's not possible to do this, should I just import to HDFS and then output to S3? Is there an easy way to do this without having to specify the schema of the whole table again?
>>>>> > > >
>>>>> > > > thanks,
>>>>> > > > imran
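
P.S. For anyone following along, here is a fuller sketch of the core-site.xml change Jarcec and Venkat are suggesting, as I understand it. The bucket name and keys are placeholders and I haven't verified this end-to-end:

    <!-- sketch only: point the default filesystem at the S3 bucket (placeholder values) -->
    <property>
      <name>fs.defaultFS</name>
      <value>s3n://iakbar.emr</value>
    </property>
    <!-- standard Hadoop s3n credential properties, so the keys don't have to be embedded in the URI -->
    <property>
      <name>fs.s3n.awsAccessKeyId</name>
      <value>MYS3APIKEY</value>
    </property>
    <property>
      <name>fs.s3n.awsSecretAccessKey</name>
      <value>MYS3SECRETKEY</value>
    </property>

With the keys set this way, --target-dir should be able to use a plain s3n://iakbar.emr/dump2/ path without credentials in the URI.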
