If you look at the error, you can see that the FS object being referenced is an HDFS location, which is not valid since your source of data is an S3 filesystem.

I'm not sure what your intention is. You say a Hive import from MySQL to S3. Do you mean a Sqoop import? If you just want the files to land on S3, then you don't need the --hive-import and --hive-overwrite options. To do a Hive import from files on S3, you would probably have to point the Hive warehouse directory at S3. You can also create an external table in Hive after the data lands on S3.

Venkat
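As a rough sketch of the plain-files-to-S3 route described above (the connect string, database, table name, bucket, and credentials are all placeholders, not values from this thread):

    sqoop import \
        --connect jdbc:mysql://dbhost/mydb \
        --username dbuser --password dbpass \
        --table mytable \
        --target-dir s3n://ACCESS_KEY:SECRET_KEY@mybucket/folder/
        # no --hive-import / --hive-overwrite: the files simply land on S3

And for the external-table alternative, a minimal HiveQL sketch, assuming Sqoop's default comma-delimited text output and an invented two-column schema:

    CREATE EXTERNAL TABLE mytable_s3 (
        id INT,
        name STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 's3n://ACCESS_KEY:SECRET_KEY@mybucket/folder/';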
On Tue, Feb 4, 2014 at 1:50 PM, Imran Akbar <[email protected]> wrote:

> That doesn't seem to be the issue, because I just manually created a
> folder called "_logs" in S3 and it worked.
> Any ideas why the sqoop import would work, but would fail when trying
> to create a "_logs" folder after it's done?
>
>
> On Tue, Feb 4, 2014 at 1:44 PM, Imran Akbar <[email protected]> wrote:
>
>> Hey Venkat,
>> Sorry, I meant to say I made that change in core-site.xml, not
>> site-core.xml.
>>
>> I'm trying to do a Hive import from MySQL to S3. I think the error is
>> popping up because Sqoop is trying to create a "_logs" directory, but
>> according to S3's naming conventions you can't start the name of a
>> bucket with an underscore:
>>
>> "Bucket names can contain lowercase letters, numbers, and dashes.
>> Each label must start and end with a lowercase letter or a number."
>> http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html
>>
>> This is the error I'm getting (the iakbar.emr/dump2/ location on S3
>> contains files, so I know Sqoop works up to this point):
>> "This file system object (hdfs://10.202.163.18:9000) does not support
>> access to the request path 's3n://****:****@iakbar.emr/dump2/_logs'"
>>
>> thanks,
>> imran
>>
>>
>> On Tue, Feb 4, 2014 at 12:45 PM, Venkat Ranganathan <
>> [email protected]> wrote:
>>
>>> I think you are trying to do a Hive import from the S3 location, and
>>> I think that may not be supported. As Jarcec said, you may want to
>>> change core-site.xml to point to S3 on your Hadoop cluster, but I
>>> have not tested this, so I am not sure whether it will work.
>>>
>>> Venkat
>>>
>>>
>>> On Tue, Feb 4, 2014 at 12:04 PM, Imran Akbar <[email protected]> wrote:
>>>
>>>> I think it may have worked, but I am getting an error.
>>>>
>>>> I added this line to site-core.xml:
>>>> <property><name>fs.defaultFS</name><value>s3n</value></property>
>>>>
>>>> and I see the following contents in my S3 directory after running
>>>> sqoop:
>>>> _SUCCESS
>>>> part-m-00000
>>>> part-m-00001
>>>> part-m-00002
>>>> part-m-00003
>>>> part-m-00004
>>>> part-m-00005
>>>>
>>>> I'm running Sqoop version 1.4.4.
>>>>
>>>> But I still get this error after running Sqoop:
>>>> http://pastebin.com/5AYCsd78
>>>>
>>>> Any ideas?
>>>> Thanks for the help so far.
>>>>
>>>> imran
>>>>
>>>>
>>>> On Tue, Feb 4, 2014 at 11:24 AM, Venkat Ranganathan <
>>>> [email protected]> wrote:
>>>>
>>>>> Which version of Sqoop are you using? Sqoop 1.4.4 addressed the
>>>>> use of other filesystems with the fix mentioned in SQOOP-1033.
>>>>>
>>>>> Thanks,
>>>>> Venkat
>>>>>
>>>>>
>>>>> On Tue, Feb 4, 2014 at 8:14 AM, Jarek Jarcec Cecho
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Yes Imran,
>>>>>> I would try defining fs.defaultFS for S3 in core-site.xml and
>>>>>> see if that helps Sqoop accept the S3 path.
>>>>>>
>>>>>> Jarcec
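For reference, a sketch of what that core-site.xml change might look like. Note that the <value>s3n</value> Imran used in his earlier message above is only a scheme name, not a full URI; fs.defaultFS normally takes a complete filesystem URI, and the s3n connector also needs credential properties. The bucket name and keys below are placeholders, not values from the thread:

    <property>
        <name>fs.defaultFS</name>
        <value>s3n://mybucket</value>
    </property>
    <property>
        <name>fs.s3n.awsAccessKeyId</name>
        <value>ACCESS_KEY</value>
    </property>
    <property>
        <name>fs.s3n.awsSecretAccessKey</name>
        <value>SECRET_KEY</value>
    </property>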
>>>>>> On Tue, Feb 04, 2014 at 08:08:17AM -0800, Imran Akbar wrote:
>>>>>> > thanks Jarek,
>>>>>> > How would I do that? Do I need to set fs.defaultFS in
>>>>>> > core-site.xml, or is it something else? Is there a document
>>>>>> > somewhere which describes this?
>>>>>> >
>>>>>> > yours,
>>>>>> > imran
>>>>>> >
>>>>>> >
>>>>>> > On Mon, Feb 3, 2014 at 9:31 PM, Jarek Jarcec Cecho <
>>>>>> [email protected]> wrote:
>>>>>> >
>>>>>> > > Would you mind trying to set the S3 filesystem as the
>>>>>> > > default one for Sqoop?
>>>>>> > >
>>>>>> > > Jarcec
>>>>>> > >
>>>>>> > > On Mon, Feb 03, 2014 at 10:25:50AM -0800, Imran Akbar wrote:
>>>>>> > > > Hi,
>>>>>> > > > I've been able to sqoop from MySQL into HDFS, but I was
>>>>>> > > > wondering if it was possible to send the data directly to
>>>>>> > > > S3 instead. I've read some posts on this forum and others
>>>>>> > > > indicating that it's not possible to do this; could
>>>>>> > > > someone confirm?
>>>>>> > > >
>>>>>> > > > I tried to get it to work by setting:
>>>>>> > > > --warehouse-dir s3n://MYS3APIKEY:MYS3SECRETKEY@bucketname/folder/
>>>>>> > > > or
>>>>>> > > > --target-dir s3n://MYS3APIKEY:MYS3SECRETKEY@bucketname/folder/
>>>>>> > > >
>>>>>> > > > but I get the error:
>>>>>> > > > ERROR tool.ImportTool: Imported Failed: This file system
>>>>>> > > > object (hdfs://10.168.22.133:9000) does not support access
>>>>>> > > > to the request path
>>>>>> > > > 's3n://****:****@iakbar.emr/new-hive-output/_logs'
>>>>>> > > > You possibly called FileSystem.get(conf) when you should
>>>>>> > > > have called FileSystem.get(uri, conf) to obtain a file
>>>>>> > > > system supporting your path
>>>>>> > > >
>>>>>> > > > If it's not possible to do this, should I just import to
>>>>>> > > > HDFS and then output to S3? Is there an easy way to do
>>>>>> > > > this without having to specify the schema of the whole
>>>>>> > > > table again?
>>>>>> > > >
>>>>>> > > > thanks,
>>>>>> > > > imran
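On the fallback Imran raises at the end, importing to HDFS first and then pushing to S3, one route not mentioned in this thread is hadoop distcp, which copies the already-imported files verbatim, so no table schema has to be specified again. A minimal sketch, with the namenode address, paths, and credentials all placeholders:

    hadoop distcp \
        hdfs://namenode:9000/user/imports/mytable \
        s3n://ACCESS_KEY:SECRET_KEY@mybucket/folder/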
