How compatible with other engines is the insert-only transactional type? Data is very often loaded with Spark, especially for cases with complex types, where it's the only option. Will landing Parquet files in the table path just work, even if we don't get consistent inserts, or does Spark need to be aware of the table format in either case?
-Shant

On Thu, May 7, 2020 at 3:09 PM Sahil Takiar <takiar.sa...@gmail.com> wrote:

> +1 on query results spooling, I've been thinking about enabling it by
> default recently since it seems to be relatively stable.
>
> On Thu, May 7, 2020 at 11:41 AM Tim Armstrong <tarmstr...@cloudera.com>
> wrote:
>
> > I'm going to revive this thread. I thought of a few more defaults that we
> > might want to change. These are default changes we (putting on Cloudera hat
> > temporarily) have made for some new production deployments and have been
> > happy with.
> >
> > Query result spooling has a bunch of advantages for resource consumption
> > and fetch speed. It uses a bounded amount of memory and scratch space, but
> > I think it's overall a better default. We've been using it in production
> > for a while now and haven't had any issues.
> >
> > https://impala.apache.org/docs/build/html/topics/impala_spool_query_results.html
> >
> > I think we should also switch the default file format to Parquet, because
> > it's more correct (default text has some issues with escaping) and because
> > it's more performant.
> >
> > https://impala.apache.org/docs/build/html/topics/impala_default_file_format.html
> >
> > We could also consider creating insert_only transactional tables by default -
> > https://impala.apache.org/docs/build/html/topics/impala_default_transactional_type.html .
> > The pros and cons here are more complex - we get more consistent behaviour
> > by default, but there can be perf/scalability consequences.
> >
> > Any objections or thoughts on these?
> >
> > On Thu, Mar 19, 2020 at 4:44 PM Tim Armstrong <tarmstr...@cloudera.com>
> > wrote:
> >
> > > I think ARM support can ship in whatever release it's ready in, since
> > > it's not a breaking change.
> > >
> > > On Wed, Mar 18, 2020 at 9:43 PM 赵 仁海 <zhaoren...@hotmail.com> wrote:
> > >
> > >> Thanks
> > >> I will work hard on this ^_^
> > >>
> > >> ________________________________
> > >> From: Jim Apple <apa...@jbapple.com>
> > >> Sent: March 19, 2020 10:21
> > >> To: dev@impala.apache.org <dev@impala.apache.org>
> > >> Subject: Re: Impala 4.0 breaking changes
> > >>
> > >> I agree. I don’t know how far we are from having arm64 support, though, and
> > >> we might not get there for a 4.0 release, I’d guess. But that doesn’t mean
> > >> it couldn’t arrive by the time for 4.1 or 4.7 or 5.55 or whatever.
> > >>
> > >> On Wed, Mar 18, 2020 at 6:32 PM Joe McDonnell <joemcdonn...@cloudera.com>
> > >> wrote:
> > >>
> > >> > Patches to add support for arm64 are definitely welcome in any release.
> > >> >
> > >> > Thanks,
> > >> > Joe
> > >> >
> > >> > On Mon, Mar 16, 2020 at 6:11 PM 赵 仁海 <zhaoren...@hotmail.com> wrote:
> > >> >
> > >> > > Hi
> > >> > >
> > >> > > Could we add support for arm64?
> > >> > >
> > >> > > Thanks
> > >> > > Zhao Renhai
> > >> > >
> > >> > > ________________________________
> > >> > > From: Joe McDonnell <joemcdonn...@cloudera.com>
> > >> > > Sent: March 17, 2020 1:07
> > >> > > To: dev@impala.apache.org <dev@impala.apache.org>
> > >> > > Subject: Impala 4.0 breaking changes
> > >> > >
> > >> > > Now that Impala 3.4 is branched and master is Impala 4.0, we need to decide
> > >> > > what breaking changes will happen in Impala 4.0. I have provided a series
> > >> > > of proposals below. I welcome feedback on them. Other proposals are also
> > >> > > welcome.
> > >> > >
> > >> > > Thanks,
> > >> > > Joe
> > >> > >
> > >> > > Proposal 0: Hadoop component versions
> > >> > >
> > >> > > Switch to CDP versions of components by default. This means that Impala
> > >> > > will use Hive 3+ (which is already essentially Hive 4 and may change names
> > >> > > to being Hive 4).
> > >> > > Remove support for CDH versions of components.
> > >> > > This was already discussed in the original thread for Impala 4, so this is
> > >> > > not new.
> > >> > >
> > >> > > Proposal 1: OS support
> > >> > >
> > >> > > Drop support for Centos 6, Ubuntu 14, and Debian (all versions)
> > >> > > Retain support for Ubuntu 16, Ubuntu 18, Centos 7, and SLES 12
> > >> > > Centos 7 development will be focused on newer Centos 7 versions such as 7.6
> > >> > > and 7.7.
> > >> > > Add support for Centos 8
> > >> > > Move main development from Ubuntu 16 to Ubuntu 18 over time.
> > >> > >
> > >> > > Proposal 2: Python support
> > >> > >
> > >> > > Drop support for Python 2.6
> > >> > > Add support for Python 3 over time.
> > >> > >
> > >> > > Proposal 3: Impala-lzo
> > >> > >
> > >> > > Drop support for Impala-lzo/hadoop-lzo
> > >> > >
> > >> > > Proposal 4: Clients
> > >> > >
> > >> > > Deprecate beeswax protocol. This means that it can be removed in the next
> > >> > > major version number, but it would not be removed in Impala 4. Current
> > >> > > users of beeswax would need to start migrating to HS2.
> > >> > >
> > >> > > Proposal 5: Sentry
> > >> > >
> > >> > > Drop support for Sentry in favor of Ranger.
> > >> > >
> > >> > > Proposal 6: Metadata
> > >> > >
> > >> > > Metadata V2 will become the default. Metadata V1 will be deprecated.
> > >> > >
> > >> > > Thanks,
> > >> > > Joe
>
> --
> Sahil Takiar
> Software Engineer
> takiar.sa...@gmail.com | (510) 673-0309