FWIW, I would not have the repair table statement as part of a production job stream. That's kind of a poor man's way to employ dynamic partitioning off the back end.
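For illustration only, here is a rough sketch of what declaring the daily partitions explicitly could look like: a small Python helper that builds one ALTER TABLE ... ADD IF NOT EXISTS PARTITION statement per partition. The table name, the partition columns `dt` and `source`, and the 14 source values are all assumptions on my part, since the thread doesn't show the actual schema.

```python
from datetime import date


def add_partition_statements(table, day, sources):
    """Return one HiveQL ADD PARTITION statement per (day, source) pair.

    `dt` and `source` are hypothetical partition columns; adjust them to
    whatever the real table is partitioned by.
    """
    return [
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt='{day.isoformat()}', source='{s}');"
        for s in sources
    ]


if __name__ == "__main__":
    # Hypothetical: 14 partitions arrive per day, one per source.
    sources = [f"s{i:02d}" for i in range(1, 15)]
    for stmt in add_partition_statements("my_table", date(2014, 3, 27), sources):
        print(stmt)
```

The printed statements could be fed to beeline or the Hive CLI as part of the daily load. IF NOT EXISTS keeps them safe to re-run, so a failed load can simply be retried instead of falling back to a full repair.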
Why not either use Hive's dynamic partitioning features or pre-declare your partitions? That way you are explicitly coding for your purpose rather than running a general repair table on the back end, knowing up front that you "broke it". Just a suggestion!

On Thu, Mar 27, 2014 at 3:18 AM, fab wol <[email protected]> wrote:

> Hey Nitin and everyone else,
>
> so let me tell you from memory that the Hive CLI Error was kind of the
> same and nothing saying like the beeline error. Would have been no uplift
> here.
>
> I was restarting the cluster (it is a cloud cluster provided by
> http://www.unbelievable-machine.net), for getting the HiveServer2 Log and
> to be very sure, that everything is well set up. During this all
> tasktrackers are deleted and newly setup (HDFS and storage is not touched
> at all, neither are configs). After that the msck repair table stmt is
> going well and its actually not so slow at all, as i thought it might be
> (ca. 110 secs per table). I guess there might have been some logs/tmp/cache
> data stacked up, and that might have caused the errors ...
>
> Slightly confusing, but i will post if I find out what exactly was
> throwing the error here in the future ...
>
> Cheers for the help
> Wolli
>
> 2014-03-27 11:03 GMT+01:00 Nitin Pawar <[email protected]>:
>
>> Without error stack, very hard to get whats wrong
>>
>> will it be possible for you to run it via hive cli and grab some logs
>> through there ?
>>
>> On Thu, Mar 27, 2014 at 3:29 PM, fab wol <[email protected]> wrote:
>>
>>> Hey Nitin,
>>>
>>> HiveServer2 Log unfortunately says nothing:
>>>
>>> Mon Mar 24 17:41:18 CET 2014 hiveserver2 stopped, pid 2540
>>> Mon Mar 24 17:43:22 CET 2014 hiveserver2 started, pid 2554
>>> Hive history file=/tmp/mapr/hive_job_log_97715747-63cd-4789-9b2e-a8b0d544cdf9_2102956370.txt
>>> OK
>>> Thu Mar 27 10:52:48 CET 2014 hiveserver2 stopped, pid 2554
>>> Thu Mar 27 10:55:52 CET 2014 hiveserver2 started, pid 2597
>>>
>>> Cheers
>>> Wolli
>>>
>>> 2014-03-27 10:04 GMT+01:00 Nitin Pawar <[email protected]>:
>>>
>>>> can you grab more logs from hiveserver2 log file?
>>>>
>>>> On Thu, Mar 27, 2014 at 2:31 PM, fab wol <[email protected]> wrote:
>>>>
>>>>> Hey everyone,
>>>>>
>>>>> I have a table with currently 5541 partitions. Daily there are 14
>>>>> partitions added. I will switch the update for the metastore from "msck
>>>>> repair table" to "alter table add partition", since its performing better,
>>>>> but sometimes this might fail, and i need the "msck repair table" command.
>>>>> But unfortunately its not working anymore with this table size it seems:
>>>>>
>>>>> 0: jdbc:hive2://clusterXYZ-> use <DB_NAME>;
>>>>> No rows affected (1.082 seconds)
>>>>> 0: jdbc:hive2://clusterXYZ-> set hive.metastore.client.socket.timeout=6000;
>>>>> No rows affected (0.029 seconds)
>>>>> 0: jdbc:hive2://clusterXYZ-> MSCK REPAIR TABLE <TABLENAME>;
>>>>> Error: Error while processing statement: FAILED: Execution Error,
>>>>> return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
>>>>> (state=08S01,code=1)
>>>>> Error: Error while processing statement: FAILED: Execution Error,
>>>>> return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
>>>>> (state=08S01,code=1)
>>>>>
>>>>> anyone had luck with getting this to work? As you can see, I already
>>>>> raised the time until the Thrift Timeout kicks in, but this error is
>>>>> happening even before the time runs off ...
>>>>>
>>>>> Cheers
>>>>> Wolli
>>>>
>>>> --
>>>> Nitin Pawar
>>
>> --
>> Nitin Pawar
