Use the example from the Kite web site:
http://kitesdk.org/docs/1.1.0/Install-Kite.html
http://kitesdk.org/docs/1.1.0/Using-the-Kite-CLI-to-Create-a-Dataset.html

Sorry for not being clear, and thanks for the help.

-Chris

On Fri, Oct 16, 2015 at 11:19 AM, Oleg Zhurakousky <ozhurakou...@hortonworks.com> wrote:

> Chris,
>
> Could you elaborate on your use case a bit more? Specifically, where is
> the source of the data you want to pump into Hive (e.g., streaming, bulk
> file load, etc.)?
>
> Cheers
> Oleg
>
> On Oct 16, 2015, at 8:56 AM, Christopher Wilson <wilson...@gmail.com> wrote:
>
> Joe, it was an HDP issue. I didn't want to leap to NiFi if the examples
> didn't work. Thanks again.
>
> Also, if there's a better way to pump data into Hive, I'm all ears.
>
> -Chris
>
> On Fri, Oct 16, 2015 at 8:53 AM, Christopher Wilson <wilson...@gmail.com> wrote:
>
>> Joe, the first hurdle is to get ojdbc6.jar downloaded and installed in
>> /usr/share/java. There's a link created in /usr/hdp/2.3.0.0-2557/hive/lib/,
>> but it points to nothing.
>>
>> Here's the hurdle I can't get past. If you install kite-dataset from the
>> web site and run through the example with debug and verbose turned on
>> (below), you get the output below. It thinks mapreduce.tar.gz doesn't
>> exist, but it does (see the listing further down). I've run this as users
>> root and hdfs with no joy. Thanks for looking.
>>
>> debug=true ./kite-dataset -v csv-import sandwiches.csv sandwiches
>>
>> WARNING: Use "yarn jar" to launch YARN applications.
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in
>> [jar:file:/usr/hdp/2.3.0.0-2557/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in
>> [jar:file:/usr/hdp/2.3.0.0-2557/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>
>> 1 job failure(s) occurred:
>> org.kitesdk.tools.CopyTask:
>> Kite(dataset:file:/tmp/0c1454eb-7831-4d6b-85a2-63a6cc8c51... ID=1 (1/1)(1):
>> java.io.FileNotFoundException: File
>> file:/hdp/apps/2.3.0.0-2557/mapreduce/mapreduce.tar.gz does not exist
>> at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:606)
>> at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
>> at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
>> at org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:110)
>> at org.apache.hadoop.fs.AbstractFileSystem.resolvePath(AbstractFileSystem.java:467)
>> at org.apache.hadoop.fs.FilterFs.resolvePath(FilterFs.java:157)
>> at org.apache.hadoop.fs.FileContext$25.next(FileContext.java:2193)
>> at org.apache.hadoop.fs.FileContext$25.next(FileContext.java:2189)
>> at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>> at org.apache.hadoop.fs.FileContext.resolve(FileContext.java:2189)
>> at org.apache.hadoop.fs.FileContext.resolvePath(FileContext.java:601)
>> at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:457)
>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142)
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:329)
>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:204)
>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:238)
>> at org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:112)
>> at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:55)
>> at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:83)
>> at java.lang.Thread.run(Thread.java:745)
>>
>> [hdfs@sandbox ~]$ hdfs dfs -ls /hdp/apps/2.3.0.0-2557/mapreduce
>> Found 2 items
>> -r--r--r--   1 hdfs hadoop    105893 2015-08-20 08:36 /hdp/apps/2.3.0.0-2557/mapreduce/hadoop-streaming.jar
>> -r--r--r--   1 hdfs hadoop 207888607 2015-08-20 08:33 /hdp/apps/2.3.0.0-2557/mapreduce/mapreduce.tar.gz
>>
>> On Thu, Oct 15, 2015 at 3:22 PM, Joe Witt <joe.w...@gmail.com> wrote:
>>
>>> Chris,
>>>
>>> Are you seeing errors in NiFi or in HDP? If you're seeing errors in
>>> NiFi, can you please send us the logs?
>>>
>>> Thanks
>>> Joe
>>>
>>> On Thu, Oct 15, 2015 at 3:02 PM, Christopher Wilson <wilson...@gmail.com> wrote:
>>> > Has anyone gotten Kite to work on HDP? I'd wanted to do this very
>>> > thing, but I am running into all kinds of issues with .jar files not
>>> > being in the distributed cache (basically in /apps/hdp).
>>> >
>>> > Any feedback appreciated.
>>> >
>>> > -Chris
>>> >
>>> > On Sat, Sep 19, 2015 at 11:04 AM, Tyler Hawkes <tyler.haw...@gmail.com> wrote:
>>> >>
>>> >> Thanks for the link. I'm using
>>> >> "dataset:hive://hadoop01:9083/default/sandwiches". hadoop01 has Hive
>>> >> on it.
>>> >>
>>> >> On Fri, Sep 18, 2015 at 7:36 AM Jeff <j.007...@gmail.com> wrote:
>>> >>>
>>> >>> Not sure if this is what you are looking for, but it has a bit on Kite.
>>> >>>
>>> >>> http://ingest.tips/2014/12/22/getting-started-with-apache-nifi/
>>> >>>
>>> >>> -cb
>>> >>>
>>> >>> On Sep 18, 2015, at 8:32 AM, Bryan Bende <bbe...@gmail.com> wrote:
>>> >>>
>>> >>> Hi Tyler,
>>> >>>
>>> >>> Unfortunately I don't think there are any tutorials on this. Can you
>>> >>> provide an example of the dataset URI you specified that is showing
>>> >>> as invalid?
>>> >>>
>>> >>> Thanks,
>>> >>>
>>> >>> Bryan
>>> >>>
>>> >>> On Fri, Sep 18, 2015 at 12:36 AM, Tyler Hawkes <tyler.haw...@gmail.com> wrote:
>>> >>>>
>>> >>>> I'm just getting going on NiFi and trying to write data to Hive,
>>> >>>> either from Kafka or an RDBMS. After setting up the Hadoop
>>> >>>> configuration files and a target dataset URI, it says the URI is
>>> >>>> invalid. I'm wondering if there's a tutorial on getting Kite set up
>>> >>>> with my version of Hive (HDP 2.2 running Hive 0.14) and NiFi, since
>>> >>>> I've been unable to find anything on Google or in the mailing list
>>> >>>> archive, and the documentation of StoreInKiteDataset is lacking a
>>> >>>> lot of detail.
>>> >>>>
>>> >>>> Any help on this would be greatly appreciated.
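On the FileNotFoundException in Chris's 8:53 AM message: the exception path begins with "file:", yet the `hdfs dfs -ls` listing shows mapreduce.tar.gz present in HDFS. That mismatch usually means the client resolved the path against the local filesystem, which happens when the Hadoop configuration directory is not on the tool's classpath and fs.defaultFS falls back to `file:///`. A hedged sketch of the two relevant properties (the NameNode hostname below is an assumption; the framework path is the one from the thread):

```xml
<!-- core-site.xml: if this file is not on the classpath, paths resolve as file:/ -->
<property>
  <name>fs.defaultFS</name>
  <!-- hostname/port are assumptions for an HDP sandbox, not taken from the thread -->
  <value>hdfs://sandbox.hortonworks.com:8020</value>
</property>

<!-- mapred-site.xml: where job submission looks for the MR framework tarball -->
<property>
  <name>mapreduce.application.framework.path</name>
  <value>/hdp/apps/2.3.0.0-2557/mapreduce/mapreduce.tar.gz#mr-framework</value>
</property>
```

In practice, exporting `HADOOP_CONF_DIR=/etc/hadoop/conf` before invoking kite-dataset is often enough for the tool to pick these up, rather than editing anything by hand.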
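A note on the dataset URIs discussed in the Bryan/Tyler exchange above: the thread itself shows two concrete forms, the Hive metastore URI Tyler used ("dataset:hive://hadoop01:9083/default/sandwiches") and the local-filesystem form that appears in the CopyTask error ("dataset:file:/tmp/..."). The sketch below lays out the common shapes; the host, port, and dataset names are placeholders taken from or modeled on the thread, not values verified against any cluster.

```shell
# Kite dataset URI shapes (placeholders, modeled on the examples in this thread):
HIVE_URI='dataset:hive://hadoop01:9083/default/sandwiches'  # Hive metastore-backed
HDFS_URI='dataset:hdfs:/user/hive/datasets/sandwiches'      # stored directly in HDFS
FILE_URI='dataset:file:/tmp/datasets/sandwiches'            # local FS, handy for testing

# Every Kite URI starts with the "dataset:" scheme, followed by a storage scheme.
for uri in "$HIVE_URI" "$HDFS_URI" "$FILE_URI"; do
  case "$uri" in
    dataset:*) echo "well-formed: $uri" ;;
    *)         echo "malformed:  $uri" ;;
  esac
done
```

A URI rejected by StoreInKiteDataset as "invalid" is often just missing the leading "dataset:" scheme or pointing at a metastore host/port that differs from hive.metastore.uris.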
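For reference, the Kite CLI tutorial linked at the top of the thread reduces to three commands: infer a schema, create the dataset, import the CSV. A minimal sketch, assuming the kite-dataset binary from the Install-Kite page is in the current directory and using made-up CSV contents (the tutorial's actual sample data may differ):

```shell
# Tiny CSV standing in for the tutorial's sample data (contents invented here)
cat > sandwiches.csv <<'EOF'
id,name,description
1,Reuben,corned beef and sauerkraut on rye
EOF

# 1) Infer an Avro schema from the CSV header
./kite-dataset csv-schema sandwiches.csv --class Sandwich -o sandwich.avsc
# 2) Create the dataset from that schema
./kite-dataset create sandwiches --schema sandwich.avsc
# 3) Import the CSV, with the debug/verbose flags Chris used in the thread
debug=true ./kite-dataset -v csv-import sandwiches.csv sandwiches
```

Running steps 1-3 as the hdfs user (as Chris tried) sidesteps HDFS permission errors but, as the thread shows, does not help if the client configuration itself is resolving paths against the wrong filesystem.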