Re: kudu-spark2 Scala 2.12 support

2020-05-21 Thread Denis Bolshakov
Hello,

Pavel, my two cents:
it looks like the information at https://spark.apache.org/docs/2.4.5/ is not
fully correct.

On the download page there is:
Download Apache Spark™

   1. Choose a Spark release: 3.0.0-preview2 (Dec 23 2019), 2.4.5 (Feb 05 2020)

   2. Choose a package type: Pre-built for Apache Hadoop 2.7, Pre-built for
      Apache Hadoop 3.2 and later, Pre-built with user-provided Apache Hadoop,
      Source Code

   3. Download Spark: spark-3.0.0-preview2-bin-hadoop2.7.tgz
      <https://www.apache.org/dyn/closer.lua/spark/spark-3.0.0-preview2/spark-3.0.0-preview2-bin-hadoop2.7.tgz>

   4. Verify this release using the 3.0.0-preview2 signatures
      <https://downloads.apache.org/spark/spark-3.0.0-preview2/spark-3.0.0-preview2-bin-hadoop2.7.tgz.asc>,
      checksums
      <https://downloads.apache.org/spark/spark-3.0.0-preview2/spark-3.0.0-preview2-bin-hadoop2.7.tgz.sha512>
      and project release KEYS <https://www.apache.org/dist/spark/KEYS>.

Note that, Spark is pre-built with Scala 2.11 except version 2.4.2, which
is pre-built with Scala 2.12.

So the available binaries are built with Scala 2.11, not 2.12. Scala 2.12 was
the default only in Spark 2.4.2, so the documentation was probably updated for
the 2.4.2 release and never rolled back.
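
One way to confirm which Scala version a given Spark binary was built with is to ask the running Scala library itself, for example from spark-shell. The snippet below is only a minimal sketch: the object name ScalaVersionCheck and the helper binaryVersion are made up for illustration, and only scala.util.Properties comes from the Scala standard library. The binary version it prints ("2.11" or "2.12") is the suffix that Scala libraries conventionally carry in their artifact names.

```scala
import scala.util.Properties

// Hypothetical helper object, for illustration only.
object ScalaVersionCheck {
  // Reduce a full version such as "2.11.12" to its binary version "2.11".
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the Scala library on the current classpath.
    val full = Properties.versionNumberString
    println(s"Scala $full (binary version ${binaryVersion(full)})")
  }
}
```

In spark-shell, paste the object and call ScalaVersionCheck.main(Array.empty); the printed binary version shows which Scala build of a connector would match that Spark installation.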

By the way, I agree: it would be nice to have support for Scala 2.12 if
there are no blockers.

Kind regards,
Denis

On Thu, 21 May 2020 at 12:35, Pavel Martynov  wrote:

> Hi, folks!
>
> It looks like the latest Spark release, 2.4.5, completely dropped Scala 2.11
> support. See https://spark.apache.org/docs/2.4.5/: "For the Scala API,
> Spark 2.4.5 uses Scala 2.12. You will need to use a compatible Scala
> version (2.12.x)."
>
> Could you please build and publish a Scala 2.12 version of the kudu-spark2
> library to the Maven repo?
>
> Thanks!
>
> --
> with best regards, Pavel Martynov
>


-- 
//with Best Regards
--Denis Bolshakov
e-mail: bolshakov.de...@gmail.com


Re: Why is the per-tablet-server upper limit 4TB?

2017-08-30 Thread Denis Bolshakov
Mike Percy's answer to @kinglee (from the Kudu Slack channel):
There are multiple issues that interact, but one issue is that if you have
many tablets you will use many threads. Adar has been focusing on improving
density lately and trying to quantify the scaling limits.


On 30 August 2017 at 13:22, yuyunliuhen  wrote:

> "Recommended maximum amount of stored data, post-replication and
> post-compression, per tablet server is 4TB."
> What will happen if there is more than 4TB of data? Disks are larger than
> before; 6TB disks are common. Is there any test data or documentation?
>
>




What is your typical tablet size?

2017-08-30 Thread Denis Bolshakov
Hello Kudu community,

Could you please share the typical size of a single tablet in your tables?

