Hi Costin
Sorry for the silence on this issue; it went quiet on my side for a while.
The good news is that I've come back to it and managed to get it all working
with the new Shark 0.9.1 release and es-hadoop 2.0.0.RC1. With ADD JAR I still
got the same exception, but when I put the JAR into the Shark lib/ folder
instead it worked fine (which seems to confirm the classpath issue you mention).
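For the record, this is roughly what I mean (jar name/path are just from my setup):

-- this route triggered the ClassCastException:
ADD JAR /path/to/elasticsearch-hadoop-2.0.0.RC1.jar;
-- copying the same jar into $SHARK_HOME/lib/ before starting Shark worked fine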
However, I now seem to have an issue with date <-> timestamp conversion.
I have a field in ES called "_ts" with type "date" and the default format
"dateOptionalTime". When I run a query that includes the timestamp field, it
comes back NULL:
select ts from table ...
(note I use a correct es.mapping.names to map the _ts field in ES to the ts
field in Hive/Shark, which has timestamp type).
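For reference, the table definition looks roughly like this (a sketch; the index/type names are placeholders for my real ones):

CREATE EXTERNAL TABLE ts_test (ts TIMESTAMP)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.resource'      = 'my_index/my_type',  -- placeholder index/type
  'es.mapping.names' = 'ts:_ts'             -- Hive column ts <- ES field _ts
);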
Below is some of the debug-level output:
14/05/13 19:19:47 DEBUG lazy.LazyPrimitive: Data not in the TIMESTAMP data type range so converted to null. Given data is :96997506-06-30 19:08:16.768
14/05/13 19:19:47 DEBUG lazy.LazyPrimitive: Data not in the TIMESTAMP data type range so converted to null. Given data is :96997605-06-28 19:08:16.768
14/05/13 19:19:47 DEBUG lazy.LazyPrimitive: Data not in the TIMESTAMP data type range so converted to null. Given data is :96997624-06-28 19:08:16.768
14/05/13 19:19:47 DEBUG lazy.LazyPrimitive: Data not in the TIMESTAMP data type range so converted to null. Given data is :96997629-06-28 19:08:16.768
14/05/13 19:19:47 DEBUG lazy.LazyPrimitive: Data not in the TIMESTAMP data type range so converted to null. Given data is :96997634-06-29 19:08:16.768
NULL
NULL
NULL
NULL
NULL
The data I index in the _ts field is a timestamp in ms (a long). It doesn't
seem to be converted correctly, but the data itself is correct (in ms at
least): I can query against it using date formats and date math in ES.
Example snippet from the debug log above:
,"_ts":1397130475607}}]}}"
Any ideas or am I doing something silly?
I do see that the Hive timestamp type expects either seconds since epoch or a
string-based format with nanosecond granularity. Is the issue simply that my
timestamp data is a long in ms?
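If so, a workaround I can think of (an untested sketch; I'm not sure es-hadoop will hand the date back as a long) would be to declare the field as BIGINT in Hive and convert at query time, since from_unixtime() expects seconds:

CREATE EXTERNAL TABLE ts_test_raw (ts_ms BIGINT)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource'      = 'my_index/my_type',
              'es.mapping.names' = 'ts_ms:_ts');

-- from_unixtime() takes seconds since epoch, so drop the millisecond part
SELECT from_unixtime(CAST(ts_ms / 1000 AS BIGINT)) FROM ts_test_raw;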
Thanks
Nick
On Thu, Mar 27, 2014 at 4:50 PM, Costin Leau <costin.l...@gmail.com> wrote:
Using the latest Hive and Hadoop is preferred as they contain various bug
fixes.
The error suggests a classpath issue - namely the same class is loaded
twice for some reason and hence the cast fails.
Let's connect on IRC - give me a ping when you're available (user is
costin).
Cheers,
On 3/27/14 4:29 PM, Nick Pentreath wrote:
Thanks for the response.
I tried the latest Shark (the cdh4 version of 0.9.1 here:
http://cloudera.rst.im/shark/ - this uses Hadoop 1.0.4 and Hive 0.11,
I believe) and built elasticsearch-hadoop from GitHub master.
Still getting the same error:
org.elasticsearch.hadoop.hive.EsHiveInputFormat$EsHiveSplit cannot be
cast to
org.elasticsearch.hadoop.hive.EsHiveInputFormat$EsHiveSplit
Will using Hive 0.11 / Hadoop 1.0.4 versus the Hive 0.12 / Hadoop 1.2.1 in
es-hadoop master make a difference?
Has anyone else actually got this working?
On Thu, Mar 20, 2014 at 2:44 PM, Costin Leau <costin.l...@gmail.com> wrote:
I recommend using master - there are several improvements in this
area. Also, using the latest Shark (0.9.0) and Hive (0.12) will help.
On 3/20/14 12:00 PM, Nick Pentreath wrote:
Hi
I am struggling to get this working too. I'm just trying locally for
now, running Shark 0.8.1, Hive 0.9.0 and ES 1.0.1 with es-hadoop
1.3.0.M2.
I managed to get a basic example working WRITING into an index. But
I'm really after READING an index.
I believe I have set everything up correctly; I've added the jar to
Shark:
ADD JAR /path/to/es-hadoop.jar;
created a table:
CREATE EXTERNAL TABLE test_read (name string, price double)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'test_index/test_type/_search?q=*');
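(If I read the newer es-hadoop docs correctly, more recent versions take the query via a separate 'es.query' property instead of appending it to 'es.resource', along these lines - a sketch, not something I've tested on 1.3.0.M2:

CREATE EXTERNAL TABLE test_read (name string, price double)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'test_index/test_type',
              'es.query'    = '?q=*');
)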
And then trying 'SELECT * FROM test_read' gives me:
org.apache.spark.SparkException: Job aborted: Task 3.0:0
failed more than 0 times; aborting job
java.lang.ClassCastException:
org.elasticsearch.hadoop.hive.EsHiveInputFormat$EsHiveSplit cannot
be cast to
org.elasticsearch.hadoop.hive.EsHiveInputFormat$EsHiveSplit
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:827)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:825)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:825)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:440)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:502)
at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:157)
FAILED: Execution Error, return code -101 from shark.execution.SparkTask
In fact I get the same error thrown when trying to READ from the table
that I successfully WROTE to...
On Saturday, 22 February 2014 12:31:21 UTC+2, Costin Leau wrote:
Yeah, it might have been some sort of network configuration issue
where services were running on different machines and localhost
pointed to a different location.
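In that kind of setup, pointing es-hadoop at the ES node explicitly
through 'es.nodes' should remove the ambiguity - for example (the
hostname below is made up):

CREATE EXTERNAL TABLE es_wiki (id BIGINT, title STRING, last_modified STRING, xml STRING, text STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'wikipedia/article',
              'es.nodes'    = 'es-master.example.com:9200');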
Either way, I'm glad to hear things are moving forward.
Cheers,
On 22/02/2014 1:06 AM, Max Lang wrote:
> I managed to get it working on ec2 without issue this time. I'd say the biggest difference was that this time I set up a dedicated ES machine.
> Is it possible that, because I was using a cluster with slaves, when I used "localhost" the slaves couldn't find the ES instance running on the master? Or do all the requests go through the master?
>
>
> On Wednesday, February 19, 2014 2:35:40 PM UTC-8, Costin Leau wrote:
>
> Hi,
>
> Setting up logging in Hive/Hadoop can be tricky, since the log4j configuration needs to be picked up by the running JVM, otherwise you won't see anything.
> Take a look at this link on how to tell Hive to use your logging settings [1].
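> For example, something along these lines should route Hive's logs to the console at DEBUG:
>
> bin/hive -hiveconf hive.root.logger=DEBUG,console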
>
> For the next release, we might introduce dedicated exceptions, for the simple fact that some libraries, like Hive, swallow the stack trace; it's then unclear what the issue is, which makes the generic exception (IllegalStateException) ambiguous.
>
> Let me know how it goes and whether you encounter any issues with Shark. Or if you don't :)
>
> Thanks!
>
>
[1] https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ErrorLogs
>
> On 20/02/2014 12:02 AM, Max Lang wrote:
> > Hey Costin,
> >
> > Thanks for the swift reply. I abandoned EC2 to take that out of the equation and managed to get everything working locally using the latest version of everything (though I realized just now I'm still on hive 0.9).
> > I'm guessing you're right about some port connection issue because I definitely had ES running on that machine.
> >
> > I changed hive-log4j.properties and added:
> >
> > #custom logging levels
> > #log4j.logger.xxx=DEBUG
> > log4j.logger.org.elasticsearch.hadoop.rest=TRACE
> > log4j.logger.org.elasticsearch.hadoop.mr=TRACE
> >
> > But I didn't see any trace logging. Hopefully I can get it working on EC2 without issue but, for the future, is this the correct way to set TRACE logging?
> >
> > Oh, and for reference, I tried running without ES up and I got the following exceptions:
> >
> > 2014-02-19 13:46:08,803 ERROR shark.SharkDriver (Logging.scala:logError(64)) - FAILED: Hive Internal Error:
> > java.lang.IllegalStateException(Cannot discover Elasticsearch version)
> > java.lang.IllegalStateException: Cannot discover Elasticsearch version
> > at org.elasticsearch.hadoop.hive.EsStorageHandler.init(EsStorageHandler.java:101)
> > at org.elasticsearch.hadoop.hive.EsStorageHandler.configureOutputJobProperties(EsStorageHandler.java:83)
> > at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobPropertiesForStorageHandler(PlanUtils.java:706)
> > at org.apache.hadoop.hive.ql.plan.PlanUtils.configureOutputJobPropertiesForStorageHandler(PlanUtils.java:675)
> > at org.apache.hadoop.hive.ql.exec.FileSinkOperator.augmentPlan(FileSinkOperator.java:764)
> > at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.putOpInsertMap(SemanticAnalyzer.java:1518)
> > at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:4337)
> > at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:6207)
> > at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:6138)
> > at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6764)
> > at shark.parse.SharkSemanticAnalyzer.analyzeInternal(SharkSemanticAnalyzer.scala:149)
> > at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:244)
> > at shark.SharkDriver.compile(SharkDriver.scala:215)
> > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
> > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:895)
> > at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:324)
> > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
> > at shark.SharkCliDriver$.main(SharkCliDriver.scala:232)
> > at shark.SharkCliDriver.main(SharkCliDriver.scala)
> > Caused by: java.io.IOException: Out of nodes and retries; caught exception
> > at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:81)
> > at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:221)
> > at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:205)
> > at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:209)
> > at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:103)
> > at org.elasticsearch.hadoop.rest.RestClient.esVersion(RestClient.java:274)
> > at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:84)
> > at org.elasticsearch.hadoop.hive.EsStorageHandler.init(EsStorageHandler.java:99)
> > ... 18 more
> > Caused by: java.net.ConnectException: Connection refused
> > at java.net.PlainSocketImpl.socketConnect(Native Method)
> > at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> > at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> > at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
> > at java.net.Socket.connect(Socket.java:579)
> > at java.net.Socket.connect(Socket.java:528)
> > at java.net.Socket.<init>(Socket.java:425)
> > at java.net.Socket.<init>(Socket.java:280)
> > at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
> > at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
> > at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
> > at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
> > at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
> > at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
> > at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
> > at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:160)
> > at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:74)
> > ... 25 more
> >
> > Let me know if there's anything in particular you'd like me to try on EC2.
> >
> > (For posterity, the versions I used were: hadoop 2.2.0, hive 0.9.0, shark 0.8.1, spark 0.8.1, es-hadoop 1.3.0.M2, java 1.7.0_15, scala 2.9.3, elasticsearch 1.0.0)
> >
> > Thanks again,
> > Max
> >
> > On Tuesday, February 18, 2014 10:16:38 PM UTC-8, Costin Leau wrote:
> >
> > The error indicates a network issue - namely es-hadoop cannot connect to Elasticsearch on the default (localhost:9200) HTTP port. Can you double-check whether that's indeed the case (using curl or even telnet on that port)? Maybe the firewall prevents any connections from being made...
> > Also, you could try using the latest Hive, 0.12, and a more recent Hadoop such as 1.1.2 or 1.2.1.
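> > A quick sanity check, for example:
> >
> > curl localhost:9200
> >
> > which should print the node's version JSON if ES is reachable.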
> >
> > Additionally, can you enable TRACE logging in your job for the es-hadoop packages org.elasticsearch.hadoop.rest and org.elasticsearch.hadoop.mr, and report back?
> >
> > Thanks,
> >
> > On 19/02/2014 4:03 AM, Max Lang wrote:
> > > I set everything up using this guide: https://github.com/amplab/shark/wiki/Running-Shark-on-EC2 on an ec2 cluster.
> > > I've copied the elasticsearch-hadoop jars into the hive lib directory and I have elasticsearch running on localhost:9200.
> > > I'm running shark in a screen session with --service screenserver and connecting to it at the same time using shark -h localhost.
> > >
> > > Unfortunately, when I attempt to write data into elasticsearch, it fails. Here's an example:
> > >
> > > [localhost:10000] shark> CREATE EXTERNAL TABLE wiki (id BIGINT, title STRING, last_modified STRING, xml STRING, text STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3n://spark-data/wikipedia-sample/';
> > > Time taken (including network latency): 0.159 seconds
> > > 14/02/19 01:23:33 INFO CliDriver: Time taken (including network latency): 0.159 seconds
> > >
> > > [localhost:10000] shark> SELECT title FROM wiki LIMIT 1;
> > > Alpokalja
> > > Time taken (including network latency): 2.23 seconds
> > > 14/02/19 01:23:48 INFO CliDriver: Time taken (including network latency): 2.23 seconds
> > >
> > > [localhost:10000] shark> CREATE EXTERNAL TABLE es_wiki (id BIGINT, title STRING, last_modified STRING, xml STRING, text STRING) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'wikipedia/article');
> > > Time taken (including network latency): 0.061 seconds
> > > 14/02/19 01:33:51 INFO CliDriver: Time taken (including network latency): 0.061 seconds
> > >
> > > [localhost:10000] shark> INSERT OVERWRITE TABLE es_wiki SELECT w.id, w.title, w.last_modified, w.xml, w.text FROM wiki w;
> > > [Hive Error]: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code -101 from shark.execution.SparkTask
> > > Time taken (including network latency): 3.575 seconds
> > > 14/02/19 01:34:42 INFO CliDriver: Time taken (including network latency): 3.575 seconds
> > >
> > > *The stack trace looks like this:*
> > >
> > >
org.apache.hadoop.hive.ql.____metadata.HiveException
(org.apache.hadoop.hive.ql.____metadata.HiveException:
java.io.IOException:
> > > Out of nodes and retries; caught exception)
> > >
> > >
org.apache.hadoop.hive.ql.____exec.FileSinkOperator.____processOp(FileSinkOperator.____java:602)shark.execution.____FileSinkOperator$$anonfun$____processPartition$1.apply(____FileSinkOperator.scala:84)____shark.execution.____FileSinkOperator$$anonfun$____processPartition$1.apply(____FileSinkOperator.scala:81)____scala.collection.Iterator$____class.foreach(Iterator.scala:____772)scala.collection.__Iterator$__$anon$19.foreach(__Iterator.__scala:399)shark.__execution.__FileSinkOperator.____processPartition(____FileSinkOperator.scala:81)____shark.execution.____FileSinkOperator$.writeFiles$____1(FileSinkOperator.scala:207)____shark.execution.____FileSinkOperator$$anonfun$____executeProcessFileSinkPartitio____n$1.apply(FileSinkOperator.____scala:211)shark.execution.____FileSinkOperator$$anonfun$____executeProcessFileSinkPartitio____n$1.apply(FileSinkOperator.____scala:211)org.apache.spark.____scheduler.ResultTask.runTask(____ResultTask.scala:107)org.____apache.spark.scheduler.Ta