from:"Jim Hughes"

Unit/Integration tests with Mini Accumulo Cluster

2019-12-13 Thread Jim Hughes

Hi all, I work on GeoMesa, and for the Accumulo 1.x line we have been using the MockAccumulo infrastructure for our unit/integration tests, which run in a Maven build. In Accumulo 2.x, since MockAccumulo is gone, we're looking at using the MiniAccumulo cluster infrastructure. Are there best

Accumulo on S3

2020-03-03 Thread Jim Hughes

Hi all, The next major release of GeoMesa is aimed at supporting Accumulo 2.x. As part of testing, my coworker Kevin and I are trying out Accumulo 2.0 on S3. Keith's blog post[1] is great. As people have tested Accumulo 2.0 in AWS, has anyone tried using EMR for the underlying HDFS cluster

Re: reading rfiles directly

2020-08-03 Thread Jim Hughes

Good question. As a very general note, one can leverage Hadoop InputFormats to create Spark RDDs. As a rather non-trivial example, you could check out GeoMesa's implementation of mapping Accumulo entries to geospatial data types. The basic strategy is to make a Hadoop Configuration object repre

Accumulo on S3 tuning for write performance

2021-06-09 Thread Jim Hughes

Hi all, We are trying a large ingest using Accumulo on S3 and we are seeing some exceptions around writes to S3. The blog post about Accumulo on S3[1] suggests setting "fs.s3a.connection.maximum" to 128. Similar advice for HBase seems to suggest bumping that value even higher. Does anyone
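For reference, the setting mentioned above could be expressed as a standard Hadoop property entry. This is only a sketch: it assumes the property is placed in the `core-site.xml` visible to the Accumulo processes, and the value 128 is simply the one quoted from the blog post, not a tuned recommendation.

```xml
<!-- Assumed location: core-site.xml on the classpath of the Accumulo processes -->
<property>
  <!-- Size of the S3A client's HTTP connection pool -->
  <name>fs.s3a.connection.maximum</name>
  <!-- Value quoted from the blog post; higher values may suit heavier ingest -->
  <value>128</value>
</property>
```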


4 matches
