I think that's right. My testing (not very scientific) puts it on par with
Redshift for the datasets I use.
On Sunday, August 7, 2016, Edward Capriolo wrote:
> A few entities set out to "kill/take out/be better than" Hive.
> I seem to remember HadoopDB, Impala, Redshift,
estruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 7 August 2016 at 13:17, Ma
Will CREATE TABLE sales5 AS SELECT * FROM SALES; not work for you?
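Spelled out, the CTAS route suggested above (using the table names from this thread) would be:

```sql
-- CREATE TABLE AS SELECT copies the rows but not the partitioning,
-- so the destination table ends up unpartitioned.
CREATE TABLE sales5 AS
SELECT * FROM sales;
```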
On Thu, Aug 4, 2016 at 5:05 PM, Nagabhushanam Bheemisetty <
nbheemise...@gmail.com> wrote:
> Hi, I have a scenario where I need to create a table from a partitioned
> table, but my destination table should not be partitioned. I won't
d using the latest
> orc-core lib (1.1.2). That does not seem to be the same implementation for
> ORC file access as the one used in Hive.
>
>
> Thanks for all hints!
>
>
>
> On Wednesday, 3 August 2016 at 08:45:45 CEST, Marcin Tustin wrote:
Yes. Create an external table whose location contains only the orc file(s)
you want to include in the table.
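A minimal sketch of that, assuming a hypothetical HDFS directory holding only the ORC files and an illustrative schema:

```sql
-- External table over pre-existing ORC files; dropping the table
-- leaves the files in place. Column list and location are illustrative.
CREATE EXTERNAL TABLE my_orc_data (
  id BIGINT,
  payload STRING
)
STORED AS ORC
LOCATION '/data/orc_output/';
```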
On Wed, Aug 3, 2016 at 7:53 AM, Johannes Stamminger <
johannes.stammin...@airbus.com> wrote:
> Hi,
>
>
> is it possible to write data to an orc file(s) using the hive-orc api and
> to
>
> On 14 July 2016 at 23:29, Marcin Tustin <mtus...@handyboo
What do you want it to do? There are at least two web interfaces I can
think of.
On Thu, Jul 14, 2016 at 6:04 PM, Mich Talebzadeh
wrote:
> Hi Gopal,
>
> If I recall you were working on a UI support for Hive. Currently the one
> available is the standard Hadoop one on
Quick note - my experience (no benchmarks) is that Tez without LLAP (we're
still not on hive 2) is faster than MR by some way. I haven't dug into why
that might be.
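For reference, switching engines per session is a one-line setting (standard Hive configuration property), which makes this kind of comparison easy to run:

```sql
-- Run subsequent queries on Tez instead of MapReduce for this session.
SET hive.execution.engine=tez;
-- Or revert to MapReduce to compare timings:
-- SET hive.execution.engine=mr;
```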
On Tue, Jul 12, 2016 at 9:19 AM, Mich Talebzadeh
wrote:
> sorry, I completely missed your points
>
> I was
This is because a GZ file is not splittable at all. Basically, try creating
this from an uncompressed file, or even better split up the file and put
the files in a directory in hdfs/s3/whatever.
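A sketch of the split-and-upload approach (file names and the HDFS path are illustrative; the upload step is shown but commented out):

```shell
# Stand-in for the real uncompressed export.
seq 1 100000 > big_table.csv
# Split into fixed-size chunks; each chunk becomes an independently
# readable input file, unlike a single .gz archive.
split -b 64k big_table.csv part_
ls part_*
# Upload all chunks into one directory backing the Hive table, e.g.:
# hdfs dfs -put part_* /data/big_table/
```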
On Tue, Jun 21, 2016 at 7:45 PM, @Sanjiv Singh
wrote:
> Hi ,
>
> I have big
Mich - it sounds like maybe you should try these benchmarks with Alluxio
abstracting the storage layer, and see how much of a difference it makes.
Alluxio should (if I understand it right) provide a lot of the optimisation
you're looking for with in-memory work.
I've never used it, but I would love
Hi All,
I have a database backed by an s3 bucket. When I try to drop that database,
I get a NullPointerException:
hive> drop database services_csvs cascade;
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:java.lang.NullPointerException)
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
They're not simply interchangeable. Sqoop is written to use MapReduce.
I actually implemented my own replacement for sqoop-export in Spark, which
was extremely simple. It wasn't any faster, because the bottleneck was the
receiving database.
Is your motivation here speed? Or correctness?
On Sat,
w latency here? Are you referring to the
>>> performance of SQL against HBase tables compared to Hive. As I understand
>>> HBase is a columnar database. Would it be possible to use Hive against ORC
>>> to achieve the same?
>>>
>>> Dr Mich Talebzadeh
> On 18 April 2016 at 23:43, Marcin Tustin <mtus...@handybook.com> wrote:
HBase has a different use case - it's for low-latency querying of big
tables. If you combined it with Hive, you might have something nice for
certain queries, but I wouldn't think of them as direct competitors.
On Mon, Apr 18, 2016 at 6:34 PM, Mich Talebzadeh
wrote:
>
This is a classic transform-load problem. You'll want to anonymise it once
before making it available for analysis.
On Thursday, March 17, 2016, Ajay Chander wrote:
> Hi Everyone,
>
> I have a csv.file which has some sensitive data in a particular column
> in it. Now I
It will be great if you can attach a small enough repro for this issue. I
> can verify it and provide a fix in case of a bug.
>
> Thanks
> Prasanth
>
> On Mar 8, 2016, at 5:52 AM, Marcin Tustin <mtus...@handybook.com
If you wish to keep it in its current location, consider creating an external
table.
On Saturday, March 12, 2016, Rex X wrote:
> Hi Mich,
>
> I am doing this, because I need to update an existing big hive table,
> which can be stored in any arbitrary customized location on
> On 7 March 2016 at 23:25, Marcin Tustin <mtus...@handybook.com> wrote:
I believe updates and deletes have always had this constraint. It's at
least hinted at by:
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-ConfigurationValuestoSetforINSERT,UPDATE,DELETE
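For reference, the client-side settings that page lists for INSERT/UPDATE/DELETE look like this (values per the Hive wiki; verify against your Hive version):

```sql
-- Required on the client for Hive ACID (UPDATE/DELETE) to work.
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;          -- not needed on Hive 2.x
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
```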
On Mon, Mar 7, 2016 at 7:46 PM, Mich Talebzadeh
Hi All,
Following on from our parquet vs orc discussion, today I observed
hive's alter table ... concatenate command remove rows from an ORC
formatted table.
1. Has anyone else observed this (fuller description below)? And
2. How do parquet users handle the file fragmentation issue?
Don't bucket on columns you expect to update.
Potentially you could delete the whole row and reinsert it.
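The delete-and-reinsert workaround might look like this (the bucketing column is from the thread; the table name and values are hypothetical):

```sql
-- UPDATE cannot touch a bucketing column, but on a transactional
-- table the whole row can be replaced in two statements.
DELETE FROM invoices WHERE invoicenumber = 1001;
INSERT INTO invoices VALUES (1002, 'replacement row data');
```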
On Sunday, March 6, 2016, Ashok Kumar wrote:
> Hi gurus,
>
> I have an ORC table bucketed on invoicenumber with "transactional"="true"
>
> I am trying to update
If you google, you'll find benchmarks showing each to be faster than the
other. In so far as there's any reality to which is faster in any given
comparison, it seems to be a result of each incorporating ideas from the
other, or at least going through development cycles to beat each other.
ORC is
Hi All,
I'm seeing some data loss/corruption in hive. This isn't HDFS-level
corruption - hdfs reports that the files and blocks are healthy.
I'm using managed ORC tables. Normally we write once an hour to each table,
with occasional concatenations through hive. We perform the writing using
spark
That is the expected behaviour. Managed tables are created within the
directory of their host database.
On Tuesday, 19 January 2016, 董亚军 wrote:
> hi list,
>
> we use the HDFS and S3 as the Hive Filesystem at the same time. here has
> an issue:
>
>
> *scenario* 1:
>
>
See this:
http://stackoverflow.com/questions/23082763/need-to-add-auto-increment-column-in-a-table-using-hive
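The usual workaround from that answer is to synthesize the id with a window function rather than a true IDENTITY column, e.g. (table names hypothetical):

```sql
-- row_number() assigns a sequential id at write time; unlike IDENTITY,
-- it is not maintained automatically on later inserts.
CREATE TABLE target_with_id AS
SELECT row_number() OVER () AS id, s.*
FROM source_table s;
```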
On Sat, Jan 16, 2016 at 11:52 AM, Ashok Kumar wrote:
> Hi,
>
> Is there an equivalent to Microsoft IDENTITY column in Hive please.
>
> Thanks and regards
>
--
I second this. I've generally found anything else to be disappointing when
working with data which is at all funky.
On Wed, Jan 13, 2016 at 8:13 PM, Alexander Pivovarov
wrote:
> Time to use Spark and Spark-Sql in addition to Hive?
> It's probably going to happen sooner or
ed
> recipient, you should destroy it immediately. Any information in this
> message shall not be understood as given or endorsed by Peridale Technology
> Ltd, its subsidiaries or their employees, unless expressly so stated. It is
> the responsibility of the recipient to ensure that this ema
You can join on any equality criterion, just like in any other relational
database. Foreign keys in "standard" relational databases are primarily an
integrity constraint. Hive in general lacks integrity constraints.
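For example, a join needs only matching values, not a declared foreign key (hypothetical tables):

```sql
-- Works whether or not any key constraint exists between the tables.
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;
```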
On Sun, Jan 10, 2016 at 9:45 AM, Ashok Kumar wrote:
> hi,
lume
> one out shortly
Yes, that's why I haven't had to compile anything.
On Wed, Dec 30, 2015 at 4:16 PM, Jörn Franke <jornfra...@gmail.com> wrote:
> HDP should have Tez already on board by default.
>
> On 30 Dec 2015, at 21:42, Marcin Tustin <mtus...@handybook.com> wrote:
>
> I'm afraid
Hi All,
We import our production database into hive on a schedule using sqoop.
Unfortunately, sqoop won't update the table schema in hive when the table
schema has changed in the source database.
Accordingly, to get updates to the table schema we drop the hive table
first.
Unfortunately, this