Hey Stephen, thanks for the advice, but as I wrote in my first post, I
wanted to do that anyway. Thanks, though, for explaining why this is
indeed the best way to go for a production system ...
Cheers
Wolli
On 27.03.14 16:05, Stephen Sprague wrote:
FWIW, I would not have the repair table statement as part of a
production job stream. That's kind of a poor man's way to employ
dynamic partitioning off the back end.
Why not either use Hive's dynamic partitioning features or pre-declare
your partitions? That way you are explicitly coding for your purpose
rather than running a general repair table on the back end, knowing you
"broke it" up front.
just a suggestion!
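To make the "pre-declare your partitions" option concrete, here is a minimal sketch that generates one ALTER TABLE ... ADD IF NOT EXISTS PARTITION statement per partition for a given day. The table name `events` and the partition columns `dt` and `src` are assumptions for illustration only; the thread does not say how the real table is partitioned.

```python
from datetime import date


def add_partition_statements(table, day, sources):
    """Build one HiveQL statement per partition of the given day.

    The partition columns `dt` and `src` are hypothetical; substitute the
    real partition columns of the table.
    """
    return [
        # IF NOT EXISTS keeps the statement safe to re-run after a partial failure.
        "ALTER TABLE {t} ADD IF NOT EXISTS PARTITION (dt='{d}', src='{s}');".format(
            t=table, d=day.isoformat(), s=src
        )
        for src in sources
    ]


if __name__ == "__main__":
    # Hypothetical table and source names, one statement printed per line.
    for stmt in add_partition_statements("events", date(2014, 3, 27), ["a", "b"]):
        print(stmt)
```

Because no LOCATION clause is given, Hive would use the default partition directory under the table's location, so this only fits if the daily job already writes its data there.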
On Thu, Mar 27, 2014 at 3:18 AM, fab wol wrote:
Hey Nitin and everyone else,
from memory, the Hive CLI error was much the same as the beeline
error and said just as little. Running it there would not have
told us anything more.
I restarted the cluster (it is a cloud cluster provided by
http://www.unbelievable-machine.net) to get the HiveServer2
log and to be sure that everything is set up properly. During
a restart all tasktrackers are deleted and set up anew (HDFS and
storage are not touched at all, and neither are the configs). After
that the msck repair table statement ran fine, and it is actually not
as slow as I thought it might be (ca. 110 secs per table). I
guess some logs/tmp/cache data had piled up,
and that might have caused the errors ...
Slightly confusing, but I will post here if I ever find out what
exactly was throwing the error ...
Cheers for the help
Wolli
2014-03-27 11:03 GMT+01:00 Nitin Pawar:
Without an error stack it is very hard to tell what's wrong.
Would it be possible for you to run it via the Hive CLI and grab
some logs from there?
On Thu, Mar 27, 2014 at 3:29 PM, fab wol wrote:
Hey Nitin,
The HiveServer2 log unfortunately says nothing:
Mon Mar 24 17:41:18 CET 2014 hiveserver2 stopped, pid 2540
Mon Mar 24 17:43:22 CET 2014 hiveserver2 started, pid 2554
Hive history file=/tmp/mapr/hive_job_log_97715747-63cd-4789-9b2e-a8b0d544cdf9_2102956370.txt
OK
Thu Mar 27 10:52:48 CET 2014 hiveserver2 stopped, pid 2554
Thu Mar 27 10:55:52 CET 2014 hiveserver2 started, pid 2597
Cheers
Wolli
2014-03-27 10:04 GMT+01:00 Nitin Pawar:
Can you grab more logs from the HiveServer2 log file?
On Thu, Mar 27, 2014 at 2:31 PM, fab wol wrote:
Hey everyone,
I have a table with currently 5541 partitions.
14 partitions are added daily. I will switch
the metastore update from "msck repair
table" to "alter table add partition", since it
performs better, but sometimes this might fail,
and then I need the "msck repair table" command.
Unfortunately it no longer seems to work at this
table size:
0: jdbc:hive2://clusterXYZ-> use <DB_NAME>;
No rows affected (1.082 seconds)
0: jdbc:hive2://clusterXYZ-> set hive.metastore.client.socket.timeout=6000;
No rows affected (0.029 seconds)
0: jdbc:hive2://clusterXYZ-> MSCK REPAIR TABLE <TABLENAME>;
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Has anyone had luck getting this to work? As you
can see, I already raised the Thrift client socket
timeout, but the error occurs well before that
timeout would elapse ...
Cheers
Wolli
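The plan described above (targeted "alter table add partition" first, "msck repair table" only as a fallback) could be sketched roughly as follows. The `execute` callable is a hypothetical stand-in for whatever sends a single statement to HiveServer2; it is not a real Hive client API, and the function name is made up for illustration.

```python
def register_partitions(execute, table, add_statements):
    """Try the targeted ADD PARTITION statements, fall back to MSCK on failure.

    `execute` is a hypothetical callable taking one HiveQL string and raising
    an exception on error. Returns "add" or "msck" to show which path ran.
    """
    try:
        for stmt in add_statements:
            execute(stmt)
        return "add"
    except Exception:
        # Fallback: let Hive rediscover partitions from the directory layout.
        execute("MSCK REPAIR TABLE {};".format(table))
        return "msck"


if __name__ == "__main__":
    sent = []
    # Using list.append as a fake executor just records the statements.
    path = register_partitions(
        sent.append,
        "events",
        ["ALTER TABLE events ADD IF NOT EXISTS PARTITION (dt='2014-03-27');"],
    )
    print(path, sent)
```

Returning which path ran makes it easy for a scheduler to log how often the fallback is actually needed.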
--
Nitin Pawar