release-3.1.2/ql/src/java/org/apache/hadoop/hive/ql/parse/TableMask.java#L86-L155
> > > and some experiments I did:
> > >
> > > https://docs.google.com/document/d/1LYk2wxT3GMw4ur5y9JBBykolfAs31P3gWRStk21PomM/edit?usp=sharing
> > >
> > > Kurt mentions that traditional DBs like DB2 follow behavior (b). I think we
> > > need to decide which behavior we'd like to support. The advantage of behavior
> > > (a) is that there is no security leak, because user X can't guess whether
> > > there are some customers with phone number '123456789'. The advantage of
> > > behavior (b) is that users don't need to rewrite their existing queries after
> > > an admin applies column masking policies.
> > >
> > > What do you think?
> > >
> > > Thanks,
> > > Quanlong
> > >
> >
>
--
Todd Lipcon
Software Engineer, Cloudera
Perhaps others have some reason why it wouldn't work well, though.
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
one already and I'm starting Kudu
> build there in an attempt to take a look at the issue later tonight.
>
> I'll keep you posted on my findings.
>
>
> Kind regards,
>
> Alexey
>
> On Wed, Jun 19, 2019 at 2:53 PM Todd Lipcon wrote:
>
>> This same
.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> TLS_HANDSHAKE step: SASL_INITIATE
> W0612 04:24:57.910481 8897 heartbeater.cc:587] Failed to heartbeat to
> 127.0.0.1:7051 (722 consecutive failures): Not authorized: Failed to ping
> master at 127.0.0.1:7051: Client connection n
Otherwise it's very easy to
introduce code that uses features not available on el7, for example.
>
> On Wed, May 22, 2019 at 10:41 AM Todd Lipcon wrote:
>
> > On Mon, May 20, 2019 at 8:36 PM Jim Apple wrote:
> >
> > > Maybe now would be a good time to implemen
sounds great to
me :) Personally I don't develop on Ubuntu 18 and in my day job it's not a
particularly important deployment platform, so I personally don't think
I'll spend much time triaging that build.
Todd
>
> On Mon, May 20, 2019 at 9:09 AM Todd Lipcon wrote:
>
ity members who have made this happen!
>
> Should we add Ubuntu 18.04 to our pre-merge Jenkins job, replace 16.04 with
> 18.04 in our pre-merge Jenkins job, or neither?
>
> I propose adding 18.04 for now (and so running both 16.04 and 18.04 on
> merge) and removing 16.04 when it start
at they were passing after he fixed the final set of issues
> there.
>
> - Tim
>
--
Todd Lipcon
Software Engineer, Cloudera
rit.cloudera.org/c/12639/. The next step is to get a
> > > > > > > > > Jenkins job running, which I've been working on.
> > > > > > > > >
> > > > > > > > > I'd like to run it regularly so we can catch any regressions. Initially
> > > > > > > > > I'll just have it email me when it fails, but after it's stable for a week
> > > > > > > > > or two I'd like to make it part of the regular set of jobs.
> > > > > > > > >
> > > > > > > > > My preference is to run it as part of the precommit jobs, in parallel to
> > > > > > > > > the Ubuntu 16.04 tests. It should not extend the critical path of precommit
> > > > > > > > > because it only runs the end-to-end tests. We could alternatively run it as
> > > > > > > > > a scheduled post-commit job, but that tends to create additional work when
> > > > > > > > > it breaks.
> > > > > > > >
> > > > > > > > What do people think?
> > > > > > > >
> > > > > > > > - Tim
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
--
Todd Lipcon
Software Engineer, Cloudera
Posted a review: http://gerrit.cloudera.org:8080/13117
On Thu, Apr 25, 2019 at 8:37 AM Bharath Vissapragada
wrote:
> +1, never used it!
>
> On Thu, Apr 25, 2019 at 8:20 AM Tim Armstrong
> wrote:
>
> > +1 - do it!
> >
> > On Thu, Apr 25, 2019 at 1:43 AM Todd
t did load, it looks like it's only a set of ~10 trivial queries
which I doubt have any real benchmark interest in modern days.
Anyone mind if I excise this cruft?
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
> > on t1.a_id = t2.a_id
> > > ) t;
> > > +----------+
> > > | count(1) |
> > > +----------+
> > > | 2        |
> > > +----------+
> > > Here are the results of the two subqueries without count(1):
> > > +------+---------+---------+
> > > | a_id | amount1 | amount2 |
> > > +------+---------+---------+
> > > | 1    | 30      | 30      |
> > > | 2    | NULL    | NULL    |
> > > +------+---------+---------+
> > > Why is the count(1) of this result set 1?
> > > +------+---------+
> > > | a_id | amount1 |
> > > +------+---------+
> > > | 1    | 30      |
> > > | 2    | NULL    |
> > > +------+---------+
> > > Why is the count(1) of this result set 2?
> > > I want to ask why the first SQL returns just 1 but the second returns 2. Is
> > > this correct, or is it an Impala bug? How does Impala handle the count
> > > aggregate? If I change sum to another aggregate function such as
> > > count/max/min, the result is the same. I tested this on versions 2.12.0 and
> > > 3.1.0.
> > >
> > >
> >
>
--
Todd Lipcon
Software Engineer, Cloudera
Ah, try now. Seems I added you to the wrong group
On Fri, Mar 29, 2019 at 10:33 AM Jim Apple wrote:
> Hm, didn't seem to work.
>
> On Thu, Mar 28, 2019 at 9:28 AM Todd Lipcon wrote:
>
> > I think you're all set. Give it a shot?
> >
> > -Todd
> >
I think you're all set. Give it a shot?
-Todd
On Tue, Mar 26, 2019 at 8:05 PM Jim Apple wrote:
> Hello! I am now using "jbapple", not "jbapple-cloudera", as my gerrit
> handle. Can someone with an admin login give me the auths to +2 changes?
>
> Thanks in
l
> > is useful.)
> >
> > I stared at that code for a bit, and I agree with you that it's plausible.
> > I'm also confused by the "bottom-up" comment of generateFilters(): it seems
> > like we walk the plan depth first, and the assignment
s "cousin" is beneficial. I think the only necessary restriction
is that an RF should not be sent from a hash join node to any descendant of
its right child.
Keep in mind I'm very new to the Impala planner code and particularly to
the runtime filter portion thereof, so I may have missed something. But
does the above sound like a plausible bug/missed optimization?
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
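
To make the restriction proposed in the message above concrete, here is a minimal Python sketch. It is not Impala planner code: `PlanNode`, `descendants`, and `may_receive_filter` are hypothetical names, and the check encodes only the one rule stated in the message (a runtime filter from a hash join must not target anything under that join's right child).

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class PlanNode:
    """Hypothetical plan node. For a hash join, `left` is assumed to be the
    probe side and `right` the build side."""
    name: str
    left: Optional["PlanNode"] = None
    right: Optional["PlanNode"] = None


def descendants(node: Optional[PlanNode]) -> Iterator[PlanNode]:
    """Yield `node` itself and every node below it, depth first."""
    if node is None:
        return
    yield node
    yield from descendants(node.left)
    yield from descendants(node.right)


def may_receive_filter(join: PlanNode, target: PlanNode) -> bool:
    """Return False only when `target` is the right child of `join` or sits
    anywhere below it; every other node is allowed by this single rule."""
    return all(node is not target for node in descendants(join.right))


# Example plan: join1(left=scan_a, right=join2(left=scan_b, right=scan_c))
scan_a, scan_b, scan_c = PlanNode("scan_a"), PlanNode("scan_b"), PlanNode("scan_c")
join2 = PlanNode("join2", left=scan_b, right=scan_c)
join1 = PlanNode("join1", left=scan_a, right=join2)
```

In this example plan, a filter produced by `join1` may go to `scan_a` but not to `scan_b` or `scan_c`, while a filter produced by `join2` may go to `scan_b` and, under this rule alone, also to `scan_a`. A real planner would apply further checks (for instance, whether the target actually scans a column bound to the join key), which this sketch leaves out.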
I have a WIP patch up here in case anyone's interested in taking an early
look: https://gerrit.cloudera.org/c/11388/
-Todd
On Tue, Sep 4, 2018 at 5:02 PM, Todd Lipcon wrote:
> On Tue, Sep 4, 2018 at 4:46 PM, Andrew Sherman
> wrote:
>
>> Hi Todd,
>>
>> I'm
a line of code yet :-D
-Todd
>
>
> -Andrew
>
> On Tue, Sep 4, 2018 at 4:43 PM Todd Lipcon wrote:
>
> > On Tue, Sep 4, 2018 at 4:28 PM, Andrew Sherman
> > wrote:
> >
> > > Hi Todd,
> > >
> > > I am making a simple fix for
coordinate
which one goes in first?
-Todd
>
>
> On Tue, Sep 4, 2018 at 4:02 PM Todd Lipcon wrote:
>
> > Hey folks,
> >
> > I'm working on a patch to add some more diagnostics from the planning
> > process into query profiles.
> >
> > Curr
or a set of well-known fields defined as a thrift enum instead of
relying on strings.. but I don't want to bite off too much in one go here :)
>
> -- Philip
>
>
> On Tue, Sep 4, 2018 at 4:02 PM Todd Lipcon wrote:
>
> > Hey folks,
> >
> > I'm working on a patc
ith a full TRuntimeProfileNode. I'd then
add some capability in the Java side to fill in counters, etc, in this
structure.
Any concerns with this approach before I go down this path? Are there any
compatibility guarantees I need to uphold with the profile output of
queries?
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
On Thu, Aug 30, 2018 at 2:48 PM, Todd Lipcon wrote:
> On Thu, Aug 30, 2018 at 2:44 PM, Pooja Nilangekar <
> pooja.nilange...@cloudera.com> wrote:
>
>> Hi Todd,
>>
>> I believe you are right. There are a couple of other race conditions in
>> the
scanner threads and scan node are hard to reason
about and could be improved more generally.
For now I'm trying to use a DebugAction to inject probabilistic failures
into the soft memory limit checks to see if I can reproduce it more easily.
-Todd
> On Thu, Aug 30, 2018 at 1:27 PM Todd L
ay be interesting to look at:
> - is there any scan node in the profile which doesn't finish any assigned
> scan ranges ?
> - if you happen to have a core, it may help to inspect the stack traces of
> the scanner threads and the disk io mgr threads to understand their states.
>
>
n.
So, there are still two mysteries:
- why did it get stuck in the first place?
- why are my "number of running queries" counters stuck at non-zero values?
Does anything above ring a bell for anyone?
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
I'm curious: can you describe the view using Hive to see what the stored
query consists of?
On Thu, Aug 23, 2018, 7:44 PM Quanlong Huang
wrote:
> Hi all,
>
> After we upgrade Impala from 2.5 to 2.12, we found some queries on views
> failed. The views contain a query hint which I think is the cau
it would be fine for
the update callback to take a long time, no?
-Todd
On Tue, Aug 21, 2018 at 11:09 AM, Todd Lipcon wrote:
> Thanks, Tim. I'm guessing once we switch over these RPCs to KRPC instead
> of Thrift we'll alleviate some of the scalability issues and maybe we can
nt to the statestore to schedule the subscriber update
> sooner. This would also work for admission control since coordinators could
> notify the statestore when the first query was admitted after the previous
> statestore update.
>
> On Tue, Aug 21, 2018 at 9:41 AM, Todd Lipcon wrote:
>
s (4 seconds).
Has anyone looked into optimizing this at all? It seems like we could have
metadata changes trigger an immediate "collection" into the C++ side, and
have the statestore update callback wait ("long poll" style) for an update
rather than skip if there is nothing available.
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
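
The "long poll" idea in the message above can be illustrated with a small Python sketch. This is not the statestore or catalogd implementation: `UpdateQueue`, `publish`, and `collect` are hypothetical names, and the sketch only shows the waiting pattern, in which the update callback blocks until a change arrives or a timeout expires instead of returning immediately when nothing is available.

```python
import threading
from typing import List


class UpdateQueue:
    """Hypothetical hand-off between a metadata producer and the update callback."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._pending: List[str] = []

    def publish(self, change: str) -> None:
        """Record a metadata change and wake any callback that is waiting."""
        with self._cond:
            self._pending.append(change)
            self._cond.notify_all()

    def collect(self, timeout_s: float) -> List[str]:
        """Long poll: block until at least one change is pending or the
        timeout expires, then return and clear whatever is pending
        (an empty list if the timeout expired first)."""
        with self._cond:
            self._cond.wait_for(lambda: bool(self._pending), timeout=timeout_s)
            changes, self._pending = self._pending, []
            return changes
```

With this shape, a change published shortly after the callback starts waiting is picked up as soon as it arrives rather than on the next periodic update. The timeout keeps the callback from blocking indefinitely when there is nothing to send.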
accordingly?
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
ink it has promise longer-term as it simplifies the overall
architecture and can help with cross-system metadata consistency. But we
can treat fetch-from-catalogd as a nice interim that should bring most of
the performance and scalability benefits to users sooner and with less risk.
I'll plan to update the original design document to reflect this in coming
days.
Thanks
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
does seem cleaner and our GCC and Clang versions are modern
> > enough to support it.
> >
> > What do people think about switching to that as the preferred way of
> > including headers only once?
> >
> > - Tim
> >
>
--
Todd Lipcon
Software Engineer, Cloudera
rote:
> >Hi Quanlong,
> >
> >Thank you for the incident note! You might be interested in
> >https://gerrit.cloudera.org/#/c/10998/ which is adding some instrumentation
> >to make it easier to notice with monitoring tools that we're running out of
> >m
ged lines? Then we could more easily just
run a single command to ensure that our patches are properly formatted
before submitting to review. Or, at the very least, some instructions for
running the same flake8-against-only-my-changed-lines that gerrit is
running?
-Todd
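
One way to approximate "flake8 against only my changed lines" locally is sketched below in Python. It is an assumption about how such a filter could work, not a description of what the gerrit job runs: `changed_lines` and `filter_flake8` are hypothetical helpers, the diff is assumed to come from something like `git diff -U0` (no context lines, so each hunk's new range covers only added or modified lines), and the report is assumed to be in flake8's default `path:row:col: code text` format.

```python
import re
from typing import Dict, Iterable, List, Optional, Set

# Hunk header of a unified diff, e.g. "@@ -10,0 +11,2 @@"
_HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_lines(diff_text: str) -> Dict[str, Set[int]]:
    """Map each file in a zero-context unified diff to its added/modified lines.

    Simplified: an added source line whose own text begins with "++ " would be
    mistaken for a file header.
    """
    result: Dict[str, Set[int]] = {}
    current: Optional[str] = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            if path == "/dev/null":  # file was deleted; nothing to lint
                current = None
            else:
                current = path[2:] if path.startswith("b/") else path
                result.setdefault(current, set())
        elif current is not None:
            match = _HUNK.match(line)
            if match:
                start = int(match.group(1))
                # An omitted count means the hunk covers exactly one line.
                count = 1 if match.group(2) is None else int(match.group(2))
                result[current].update(range(start, start + count))
    return result


def filter_flake8(report: Iterable[str], changed: Dict[str, Set[int]]) -> List[str]:
    """Keep only the report entries that point at a changed line."""
    kept = []
    for entry in report:
        parts = entry.split(":", 3)
        if len(parts) < 4 or not parts[1].isdigit():
            continue  # not a "path:row:col: message" line
        path = parts[0][2:] if parts[0].startswith("./") else parts[0]
        if int(parts[1]) in changed.get(path, set()):
            kept.append(entry)
    return kept
```

A wrapper script could feed the output of the diff command into `changed_lines`, run flake8 on the touched files, and print only what `filter_flake8` keeps. Which revisions to diff against is left to the caller.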
>
>
> On Mon, Jul 30, 20
ou for reading!
>
>
> Further thoughts:
> * We might need to mention in the docs that adding java UDFs may not use
> the given jar file if the impala CLASSPATH already contains a jar file
> containing the same class.
> * We should avoid using Hive builtin UDFs and any other Java UDFs since
> their memory is not tracked.
> * How to track memory used in JVM? HBase, a pure java project, is able to
> track its MemStore and BlockCache size. Can we learn from it?
>
>
> Thanks,
> Quanlong
> --
> Quanlong Huang
> Software Developer, Hulu
--
Todd Lipcon
Software Engineer, Cloudera
> test the basic mechanism for a bit now.
> >
> > It excludes docs/ commits.
> >
> > Let me know if you see any problems with it.
> >
> > Thanks,
> > Tim
> >
>
--
Todd Lipcon
Software Engineer, Cloudera
requirement. However, it's unclear to me if any client of Impala
> > makes assumptions about the ordering of the output in PrintErrorMap(). So,
> > sending this email to the list in case anyone knows anything.
> >
> > --
> > Thanks,
> > Michael
> >
>
--
Todd Lipcon
Software Engineer, Cloudera
, Pooja Nilangekar <
pooja.nilange...@cloudera.com.invalid> wrote:
> I am still having the same issue. (I haven't tried git clean either).
>
> Thanks,
> Pooja
>
>
> On Wed, Jul 18, 2018 at 5:30 PM Todd Lipcon
> wrote:
>
> > Anyone else still having problems?
ble.
>
> On Wed, Jul 18, 2018 at 2:16 PM Fredy Wijaya wrote:
>
> > A CR to fix the issue: https://gerrit.cloudera.org/c/10981/
> > I'm running a dry-run to make sure everything is good with the new build
> > number.
> >
> > On Wed, Jul 18, 2018 at 12:24
d338
> E0718 10:03:13.659101 23734 impala-server.cc:289] NoClassDefFoundError:
> org/apache/hadoop/fs/Options$ChecksumCombineMode
> E0718 10:03:13.659122 23734 impala-server.cc:292] Aborting Impala Server
> startup due to improper configuration. Impalad exiting.
>
--
Todd Lipcon
Software Engineer, Cloudera
On Tue, Jul 17, 2018 at 5:27 PM, Sailesh Mukil wrote:
> On Tue, Jul 17, 2018 at 2:47 PM, Todd Lipcon
> wrote:
>
> > Hey folks,
> >
> > I'm working on a regression test for IMPALA-7311 and found something
> > interesting. It appears that in our normal minicl
erent spoofed username so that the
minicluster environment is more authentic to true cluster environments? We
can do this easily by setting the HADOOP_USER_NAME environment variable or
system property.
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
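
As a rough illustration of the HADOOP_USER_NAME suggestion in the message above, the following Python sketch builds an environment for a child process with a spoofed user name. The helper name `env_with_hadoop_user` and the example user are hypothetical, and how the minicluster scripts actually launch their daemons is not shown in this thread, so the launch step is only described, not coded.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_hadoop_user(user: str,
                         base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of `base` (default: the current process environment)
    with HADOOP_USER_NAME set. The caller's environment is not modified."""
    env = dict(os.environ if base is None else base)
    env["HADOOP_USER_NAME"] = user
    return env


# Hypothetical user name, chosen only for this example.
spoofed_env = env_with_hadoop_user("impala_test_user", base={"PATH": "/usr/bin"})
```

The resulting dictionary can be passed as the `env` argument of `subprocess.Popen` when starting a daemon, so that only that child process sees the spoofed name. For JVM processes, the message also mentions a system property as an alternative.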
On Thu, Jul 12, 2018 at 5:07 PM, Bharath Vissapragada <
bhara...@cloudera.com.invalid> wrote:
> On Thu, Jul 12, 2018 at 12:03 PM Todd Lipcon
> wrote:
>
>
> > So, I think my proposal here is:
> >
> > 1. Query behavior on existing tables
> > - If the tab
ormat is non-Avro,
- AND the table contains column types incompatible with Avro (eg tinyint),
- THEN disallow changing the file format of an existing partition to Avro
-Todd
On Wed, Jul 11, 2018 at 9:32 PM, Todd Lipcon wrote:
> Turns out it's even a bit more messy. The presence of
though still I don't think that
would iterate over all partitions in the case of a mixed table.
-Todd
On Wed, Jul 11, 2018 at 9:03 PM, Bharath Vissapragada <
bhara...@cloudera.com.invalid> wrote:
> Agreed.
>
> On Wed, Jul 11, 2018 at 8:55 PM Todd Lipcon
> wrote:
>
> >
to do this. If someone
> > asked me to support a mixed avro/parquet table I would suggest they create
> > a view. If they kept insisting I would reply "Well it is your funeral."
> >
> > On Wed, Jul 11, 2018 at 7:51 PM, Todd Lipcon
> > wrote:
> >
posed new behavior, we can avoid looking at all partitions. This is
important for any metadata design which supports fine-grained loading of
metadata to the coordinator.
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
nfusion about what the best practice is, particularly for
> > > > people coming from other code bases. I personally like the distinction, but
> > > > I don't feel that strongly about it.
> > > >
> > > > What do people think? Should we continue using scoped_ptr or move away from
> > > > it? There is already a JIRA to make the change but we haven't done it
> > > > because of the above reasons:
> > > > https://issues.apache.org/jira/browse/IMPALA-3444
> > > >
> > > > - Tim
> > > >
> > >
> >
>
--
Todd Lipcon
Software Engineer, Cloudera
our 12,300 customers." Such a comment might be by a
fellow Cloudera employee, or by someone at some other contributor. Happy to
remove that heading if it seems like it's not inclusive.
-Todd
> On Fri, Jun 8, 2018 at 9:54 AM, Todd Lipcon wrote:
>
>> On Thu, Jun 7, 2018 at 9:
the amount of
load on source systems. The downsides, though, are:
- extra hop between impalad and source system on a cache miss
- extra complexity in failure cases (source system becomes a SPOF, and if
we replicate it we have more complex consistency to worry about)
- scalability benefits may only be re
wrote:
> https://lists.apache.org/thread.html/74a3f3f945403b50515c658047d3284955288a637207e4f97ecc15d1@%3Cgeneral.incubator.apache.org%3E
>
> I think it’s worth considering how the two communities could work together
> for the benefit of all.
>
--
Todd Lipcon
Software Engineer, Cloudera
t-after-passing-build bit.
>
Will this give the proper "+1 verified" mark on a gerrit such that it can
be committed?
-Todd
> On Fri, Jun 1, 2018 at 11:35 AM, Todd Lipcon wrote:
>
> > Hey folks,
> >
> > Would someone mind generating an account for me on
y review/commit privileges.
Thanks
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
/kudu-table-sink.cc#L151
> However, when building with -static in ./buildall.sh, the kudu-client is
> still linked dynamically (see `ldd be/build/latest/service/impalad`). Is
> there a build option to link it statically?
>
>
> Thanks,
> Quanlong
--
Todd Lipcon
Software Engineer, Cloudera
Hey Impala devs,
Over the past 3 weeks I have been investigating various issues with
Impala's treatment of metadata. Based on data from a number of user
deployments, and after discussing the issues with a number of Impala
contributors and committers, I've come up with a proposal for a new design.
he jars. Vice versa, if I only updated
something in the front end I'd only re-copy the FE jar to the cluster. That
way you only pay the expensive deployment step the first time you set up
your cluster.
-Todd
> At 2018-04-26 00:49:05, "Todd Lipcon" wrote:
> >Hi Quanlong,
really needed. This work is tedious and prone to errors.
>
>
> Is there a best practice for packaging and distributing the binaries after
> compiling?
>
>
> Thanks,
> Quanlong
--
Todd Lipcon
Software Engineer, Cloudera
of alignment and the
> default allocator only guarantees 16 bytes
> [clang-diagnostic-over-aligned]
> boost::shared_ptr<ImpalaServer> impala_server(new
> ImpalaServer(&exec_env));
>
--
Todd Lipcon
Software Engineer, Cloudera
-experimental-flags. Users who
decide to flip that on are explicitly acknowledging that they're treading
on some unproven ground and they might get crashes, correctness issues, etc.
Of course the goal should be that, if after a release or two of use the
feedback from users is that it works well and few issues have been found,
I'd expect it to be marked non-experimental.
-Todd
--
Todd Lipcon
Software Engineer, Cloudera