Thanks for your reply! Your comment made me realize that the table I was
trying to write to didn't have any partitions, while I was trying to write
to a specific partition:
val mapper: DelimitedRecordHiveMapper = new DelimitedRecordHiveMapper()
  .withColumnFields(new Fields(colNames))
  .withTimeAsPartitionField("YYYY/MM/DD")
Could that be the problem?
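
For context, here is a minimal sketch of how such a mapper is typically wired
into a Trident HiveState (option values are illustrative placeholders, not my
exact settings; the metastore URI, database and table match the attached log):

import org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper
import org.apache.storm.hive.common.HiveOptions
import org.apache.storm.hive.trident.HiveStateFactory
import org.apache.storm.tuple.Fields

// Illustrative column list; the real one comes from my schema.
val colNames = Array("field1", "field2")

val mapper = new DelimitedRecordHiveMapper()
  .withColumnFields(new Fields(colNames: _*))
  // Only meaningful if the target table is actually partitioned:
  // .withTimeAsPartitionField("YYYY/MM/DD")

val options = new HiveOptions(
    "thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083",
    "data_stream", "air_traffic_test", mapper)
  .withTxnsPerBatch(2)
  .withBatchSize(100)

val hiveStateFactory = new HiveStateFactory().withOptions(options)

The factory is then passed to stream.partitionPersist(...) together with the
HiveUpdater that shows up in the stack trace.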
Anyway, I tried commenting out the withTimeAsPartitionField call and I am now
getting a totally different error, which could really be the actual issue
(the complete stack trace is attached):
java.io.IOException: No FileSystem for scheme: hdfs
which makes me think I am bundling the wrong HDFS jar in the application
jar I'm building. Still, the bundled version is HDFS 2.6.1, while the
version on the cluster is 2.7.3.2.5.5.0-157 (using HDP 2.5); shouldn't
they be compatible?
Any suggestions?
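
In case it helps narrow things down, here is a small check I can run on the
same classpath as the topology jar (a sketch; the explicit fs.hdfs.impl line
is the usual workaround when an uber-jar build drops the
META-INF/services/org.apache.hadoop.fs.FileSystem entries, and it still
requires hadoop-hdfs to actually be bundled):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

object HdfsSchemeCheck extends App {
  val conf = new Configuration()
  try {
    // Fails with "No FileSystem for scheme: hdfs" if the ServiceLoader
    // metadata for hadoop-hdfs was lost while assembling the uber-jar.
    println(FileSystem.getFileSystemClass("hdfs", conf))
  } catch {
    case e: java.io.IOException =>
      println(s"lookup failed: ${e.getMessage}")
      // Workaround: name the implementation explicitly so the services
      // file is not needed (the class itself lives in hadoop-hdfs).
      conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem")
      println(FileSystem.getFileSystemClass("hdfs", conf))
  }
}

If the second lookup succeeds, the jar is only missing the services entry:
either merging META-INF/services files in the shade/assembly step, or not
bundling Hadoop at all and relying on the cluster classpath, should fix it.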
2017-07-10 20:02 GMT+02:00 Eugene Koifman <[email protected]>:
> Are you able to write to Hive to an existing partition? (The stack trace
> shows that it’s being created)
>
>
>
>
>
> *From: *Federico D'Ambrosio <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Monday, July 10, 2017 at 7:38 AM
> *To: *"[email protected]" <[email protected]>, "[email protected]"
> <[email protected]>
> *Subject: *Non-local session path expected to be non-null trying to write
> on Hive using storm-hive
>
>
>
> Greetings,
>
> I'm trying to get a working dataflow stack on a 6 node cluster (2 masters +
> 4 slaves, no Kerberos) using Kafka (2.10_0.10), Storm (1.0.1) and Hive2
> (1.2.1). Storm is able to communicate with Kafka, but seemingly cannot
> operate on Hive (on master-1), even though it manages to connect to its
> metastore.
>
> I originally thought it was a permissions problem on either HDFS or the
> local filesystem, but even though I set 777 permissions on /tmp/hive,
> the issue is still there.
>
> In core-site.xml:
>
> - hadoop.proxyuser.hcat.groups
> - hadoop.proxyuser.hcat.hosts
> - hadoop.proxyuser.hdfs.groups
> - hadoop.proxyuser.hdfs.hosts
> - hadoop.proxyuser.hive.groups
> - hadoop.proxyuser.hive.hosts
> - hadoop.proxyuser.root.groups
> - hadoop.proxyuser.root.hosts
>
> are all set to '*'.
>
> Hive2, as far as I can see, is correctly set up to work with transactions:
> the target table has transactional=true, is stored as ORC and is bucketed.
> In the hive-site.xml:
>
> - hive.compactor.worker.threads = 1
> - hive.compactor.initiator.on = true
> - hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
>
> I get a NullPointerException; you may find the stack trace among the
> attached files.
>
> From what I can gather, the NullPointerException is thrown in the
> following method inside SessionState:
>
> public static Path getHDFSSessionPath(Configuration conf) {
>   SessionState ss = SessionState.get();
>   if (ss == null) {
>     String sessionPathString = conf.get(HDFS_SESSION_PATH_KEY);
>     Preconditions.checkNotNull(sessionPathString, "Conf non-local session path expected to be non-null");
>     return new Path(sessionPathString);
>   }
>   Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");
>   return ss.hdfsSessionPath;
> }
>
>
>
> Specifically, by:
>
>   Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");
>
> So, it seems to be an HDFS-related issue, but I can't understand why it's
> happening.
>
> From what I gather, this occurs when Hive tries to retrieve the local path
> of the session, which is stored in the _hive.local.session.path
> configuration variable. The value of this variable is assigned each time a
> new Hive session is created, and it is formed by joining the path for user
> temporary files (hive.exec.local.scratchdir) with the session ID
> (hive.session.id).
>
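> For reference, a quick sketch that prints the two values being merged
> (assuming hive-site.xml is on the classpath so HiveConf picks it up):
>
> import org.apache.hadoop.hive.conf.HiveConf
>
> val conf = new HiveConf()
> // hive.exec.local.scratchdir and hive.session.id, as described above;
> // hive.session.id is normally filled in only once a session starts.
> val scratchDir = conf.getVar(HiveConf.ConfVars.LOCALSCRATCHDIR)
> val sessionId = conf.getVar(HiveConf.ConfVars.HIVESESSIONID)
> println(s"$scratchDir/$sessionId")
>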
> If it is indeed a permissions issue, what should I look into to find the
> origin of the problem?
>
> Thanks for your help,
>
> Federico
>
--
Federico D'Ambrosio
2017-07-10 20:07:40.212 o.a.s.h.t.HiveState [INFO] Creating Writer to Hive end
point :
{metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083',
database='data_stream', table='air_traffic_test', partitionVals=[] }
2017-07-10 20:07:40.262 h.metastore [INFO] Trying to connect to metastore with
URI thrift://master-1.localdomain:9083
2017-07-10 20:07:40.273 h.metastore [INFO] Connected to metastore.
2017-07-10 20:07:40.290 h.metastore [INFO] Trying to connect to metastore with
URI thrift://master-1.localdomain:9083
2017-07-10 20:07:40.298 h.metastore [INFO] Connected to metastore.
2017-07-10 20:07:40.416 o.a.h.h.s.AbstractRecordWriter [ERROR] Failed creating
record updater
java.io.IOException: No FileSystem for scheme: hdfs
at
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
~[stormjar.jar:?]
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
~[stormjar.jar:?]
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
~[stormjar.jar:?]
at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
~[stormjar.jar:?]
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
~[stormjar.jar:?]
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
~[stormjar.jar:?]
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
~[stormjar.jar:?]
at
org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.<init>(OrcRecordUpdater.java:215)
~[stormjar.jar:?]
at
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getRecordUpdater(OrcOutputFormat.java:282)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.AbstractRecordWriter.createRecordUpdater(AbstractRecordWriter.java:137)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.AbstractRecordWriter.newBatch(AbstractRecordWriter.java:117)
[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.DelimitedInputWriter.newBatch(DelimitedInputWriter.java:47)
[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:506)
[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:458)
[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:345)
[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:325)
[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:256)
[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:253)
[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366)
[stormjar.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_77]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[?:1.8.0_77]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[?:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
2017-07-10 20:07:40.417 o.a.s.h.t.HiveState [ERROR] Failed to create HiveWriter
for endpoint:
{metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083',
database='data_stream', table='air_traffic_test', partitionVals=[] }
org.apache.storm.hive.common.HiveWriter$ConnectFailure: Failed connecting to
EndPoint
{metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083',
database='data_stream', table='air_traffic_test', partitionVals=[] }
at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:80)
~[stormjar.jar:?]
at
org.apache.storm.hive.common.HiveUtils.makeHiveWriter(HiveUtils.java:50)
~[stormjar.jar:?]
at
org.apache.storm.hive.trident.HiveState.getOrCreateWriter(HiveState.java:206)
[stormjar.jar:?]
at
org.apache.storm.hive.trident.HiveState.writeTuples(HiveState.java:125)
[stormjar.jar:?]
at
org.apache.storm.hive.trident.HiveState.updateState(HiveState.java:112)
[stormjar.jar:?]
at
org.apache.storm.hive.trident.HiveUpdater.updateState(HiveUpdater.java:30)
[stormjar.jar:?]
at
org.apache.storm.hive.trident.HiveUpdater.updateState(HiveUpdater.java:27)
[stormjar.jar:?]
at
org.apache.storm.trident.planner.processor.PartitionPersistProcessor.finishBatch(PartitionPersistProcessor.java:98)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.trident.planner.SubtopologyBolt.finishBatch(SubtopologyBolt.java:151)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.trident.topology.TridentBoltExecutor.finishBatch(TridentBoltExecutor.java:266)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.trident.topology.TridentBoltExecutor.checkFinish(TridentBoltExecutor.java:299)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:373)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at
org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
[storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
Caused by: org.apache.storm.hive.common.HiveWriter$TxnBatchFailure: Failed
acquiring Transaction Batch from EndPoint:
{metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083',
database='data_stream', table='air_traffic_test', partitionVals=[] }
at
org.apache.storm.hive.common.HiveWriter.nextTxnBatch(HiveWriter.java:264)
~[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:72)
~[stormjar.jar:?]
... 21 more
Caused by: org.apache.hive.hcatalog.streaming.StreamingIOFailure: Unable to get
new record Updater
at
org.apache.hive.hcatalog.streaming.AbstractRecordWriter.newBatch(AbstractRecordWriter.java:120)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.DelimitedInputWriter.newBatch(DelimitedInputWriter.java:47)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:506)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:458)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:345)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:325)
~[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:256)
~[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:253)
~[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366)
~[stormjar.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
~[?:1.8.0_77]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[?:1.8.0_77]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
~[?:1.8.0_77]
... 1 more
Caused by: java.io.IOException: No FileSystem for scheme: hdfs
at
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
~[stormjar.jar:?]
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
~[stormjar.jar:?]
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
~[stormjar.jar:?]
at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
~[stormjar.jar:?]
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
~[stormjar.jar:?]
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
~[stormjar.jar:?]
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
~[stormjar.jar:?]
at
org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.<init>(OrcRecordUpdater.java:215)
~[stormjar.jar:?]
at
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getRecordUpdater(OrcOutputFormat.java:282)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.AbstractRecordWriter.createRecordUpdater(AbstractRecordWriter.java:137)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.AbstractRecordWriter.newBatch(AbstractRecordWriter.java:117)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.DelimitedInputWriter.newBatch(DelimitedInputWriter.java:47)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:506)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:458)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:345)
~[stormjar.jar:?]
at
org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:325)
~[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:256)
~[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:253)
~[stormjar.jar:?]
at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366)
~[stormjar.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
~[?:1.8.0_77]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[?:1.8.0_77]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
~[?:1.8.0_77]
... 1 more
2017-07-10 20:07:40.417 o.a.s.h.t.HiveState [WARN] hive streaming failed.