Re: MSCK REPAIR TABLE

fab wol Thu, 27 Mar 2014 15:37:28 -0700

Hey Stephen, thanks for the advice, but as I wrote in my first post, I wanted to do that anyway. But thanks for the explanation of why this is indeed the best way to go for a production system ...


Cheers
Wolli


On 27.03.14 16:05, Stephen Sprague wrote:

fwiw, I would not have the repair table statement as part of a production job stream. That's kinda a poor man's way to employ dynamic partitioning off the back end.

Why not either use hive's dynamic partitioning features or pre-declare your partitions? That way you are explicitly coding for your purpose rather than running a general repair table on the backend knowing you "broke it" up front.


just a suggestion!
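
To illustrate the "pre-declare your partitions" option, here is a minimal sketch in Python that builds the statements for one day's load. The table name `events` and the partition columns `dt` and `slot` are assumptions made for the example; the thread does not say how the table is actually partitioned. The generated HiveQL uses ADD IF NOT EXISTS, so re-running the job for the same day should not fail on partitions that are already registered.

```python
from datetime import date


def add_partition_statements(table, day, slots):
    """Build one ALTER TABLE ... ADD PARTITION statement per slot of a day.

    `table`, `dt` and `slot` are hypothetical names; adapt them to the
    real partition columns of the table.
    """
    stmts = []
    for slot in slots:
        spec = "dt='{}', slot='{}'".format(day.isoformat(), slot)
        stmts.append(
            "ALTER TABLE {} ADD IF NOT EXISTS PARTITION ({});".format(table, spec)
        )
    return stmts


if __name__ == "__main__":
    # 14 partitions per day, as in the original post.
    for stmt in add_partition_statements("events", date(2014, 3, 27), range(14)):
        print(stmt)
```

The printed statements could be written to a file and run with beeline or the hive cli as part of the daily job.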

On Thu, Mar 27, 2014 at 3:18 AM, fab wol <[email protected]> wrote:


    Hey Nitin and everyone else,

    From memory, the Hive CLI error was much the same as the beeline
    error and just as uninformative, so running it there would not
    have helped.

    I restarted the cluster (it is a cloud cluster provided by
    http://www.unbelievable-machine.net) to get the HiveServer2 log
    and to be very sure that everything is set up properly. During a
    restart all tasktrackers are deleted and set up anew (HDFS and
    storage are not touched at all, and neither are the configs).
    After that the msck repair table statement runs fine, and it's
    actually not as slow as I thought it might be (ca. 110 secs per
    table). I guess some logs/tmp/cache data had piled up, and that
    might have caused the errors ...

    Slightly confusing, but I will post here if I find out what
    exactly was throwing the error ...

    Cheers for the help
    Wolli


    2014-03-27 11:03 GMT+01:00 Nitin Pawar <[email protected]>:

        Without an error stack it is very hard to tell what's wrong.
        Would it be possible for you to run it via the hive cli and
        grab some logs there?


        On Thu, Mar 27, 2014 at 3:29 PM, fab wol <[email protected]> wrote:

            Hey Nitin,

            The HiveServer2 log unfortunately says nothing:

            Mon Mar 24 17:41:18 CET 2014 hiveserver2 stopped, pid 2540
            Mon Mar 24 17:43:22 CET 2014 hiveserver2 started, pid 2554
            Hive history file=/tmp/mapr/hive_job_log_97715747-63cd-4789-9b2e-a8b0d544cdf9_2102956370.txt
            OK
            Thu Mar 27 10:52:48 CET 2014 hiveserver2 stopped, pid 2554
            Thu Mar 27 10:55:52 CET 2014 hiveserver2 started, pid 2597

            Cheers
            Wolli


            2014-03-27 10:04 GMT+01:00 Nitin Pawar <[email protected]>:

                Can you grab more logs from the hiveserver2 log file?


                On Thu, Mar 27, 2014 at 2:31 PM, fab wol <[email protected]> wrote:

                    Hey everyone,

                    I have a table with currently 5541 partitions, and
                    14 partitions are added daily. I will switch the
                    metastore update from "msck repair table" to
                    "alter table add partition", since it performs
                    better, but sometimes this might fail, and then I
                    need the "msck repair table" command.
                    Unfortunately it no longer seems to work at this
                    table size:

                    0: jdbc:hive2://clusterXYZ-> use <DB_NAME>;
                    No rows affected (1.082 seconds)
                    0: jdbc:hive2://clusterXYZ-> set
                    hive.metastore.client.socket.timeout=6000;
                    No rows affected (0.029 seconds)
                    0: jdbc:hive2://clusterXYZ-> MSCK REPAIR TABLE
                    <TABLENAME>;
                    Error: Error while processing statement: FAILED:
                    Execution Error, return code 1 from
                    org.apache.hadoop.hive.ql.exec.DDLTask
                    (state=08S01,code=1)
                    Error: Error while processing statement: FAILED:
                    Execution Error, return code 1 from
                    org.apache.hadoop.hive.ql.exec.DDLTask
                    (state=08S01,code=1)

                    Has anyone had luck getting this to work? As you
                    can see, I already raised the Thrift timeout, but
                    the error occurs well before the timeout
                    expires ...

                    Cheers
                    Wolli
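
For the fallback case in the original post (a daily "alter table add partition" run failed and some partitions are missing), a narrower alternative to a full repair is to register only the partitions that are on storage but not in the metastore. The Python sketch below assumes two lists of strings in the `key=value/key=value` form that SHOW PARTITIONS prints: the partition directories found under the table location and the partitions the metastore already knows. How those lists are collected is left out, and the table and column names are hypothetical. This is a rough approximation of the idea, not a description of how MSCK REPAIR TABLE works internally.

```python
def parse_spec(path):
    """Turn 'dt=2014-03-27/slot=3' into [('dt', '2014-03-27'), ('slot', '3')].

    Assumes every path component has the form key=value and that values
    need no unescaping.
    """
    return [tuple(part.split("=", 1)) for part in path.strip("/").split("/")]


def missing_partition_statements(table, dirs_on_storage, known_partitions):
    """Build ADD PARTITION statements for directories the metastore lacks."""
    known = set(p.strip("/") for p in known_partitions)
    on_storage = set(d.strip("/") for d in dirs_on_storage)
    stmts = []
    for path in sorted(on_storage - known):
        spec = ", ".join("{}='{}'".format(k, v) for k, v in parse_spec(path))
        stmts.append(
            "ALTER TABLE {} ADD IF NOT EXISTS PARTITION ({});".format(table, spec)
        )
    return stmts


if __name__ == "__main__":
    dirs = ["dt=2014-03-27/slot=0", "dt=2014-03-27/slot=1"]
    known = ["dt=2014-03-27/slot=0"]
    for stmt in missing_partition_statements("events", dirs, known):
        print(stmt)
```

Because only the difference is submitted, the number of statements stays small even when the table already has thousands of partitions.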

--Nitin Pawar
