This is an automated email from the ASF dual-hosted git repository.
djkevincr pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/gora.git
The following commit(s) were added to refs/heads/master by this push:
new 2d0d910 GORA-664 Add datastore for Elasticsearch (#234)
2d0d910 is described below
commit 2d0d91093dd7bceb07a4d64fb6b075bdc2117abd
Author: Maria Podorvanova <36038640+podorvan...@users.noreply.github.com>
AuthorDate: Thu Aug 12 04:22:22 2021 +1000
GORA-664 Add datastore for Elasticsearch (#234)
* Create basic gora-elasticsearch module
* Bump Elasticsearch version and remove redundant dependency
* Implement connection and basic schema management
- Create ElasticsearchStore class with connection initialization
- Create basic Elasticsearch types mapping
- Implement the necessary files for mapping representation
(ElasticsearchMapping, ElasticsearchMappingBuilder)
- Read schema from mapping file
- Cover initialization with test
* Set up Elasticsearch client parameters
- Created gora.properties file with configuration properties
- Loaded connection parameters from configuration
- Implemented connection to Elasticsearch cluster with
ElasticsearchParameters
- Covered ElasticsearchParameters with tests
- Added javadoc descriptions
* Add a property for choosing the authentication method
* Implement testing with Elasticsearch container
- Added testing dependencies
- Added GoraElasticsearchTestDriver with Elasticsearch container
- Added javadoc descriptions to GoraElasticsearchTestDriver class
- Fixed two existing tests in accordance with the Elasticsearch container
* Implement some methods for schema management
Implemented schemaExists, createSchema, deleteSchema and flush methods
* Add XSD validation file for the XML mapping
* Fix XSD validation
- Relocated gora-elasticsearch.xsd file to main resources
- Covered XSD validation with test
- Added gora-elasticsearch-mapping-invalid.xml file for test
* Set up Elasticsearch container's authentication parameters
* Implement exists method
* Add comments for the connection parameters
* Fix authentication
- Set up the password for the Elasticsearch container properly
- Set default Elasticsearch container server’s username in gora.properties
- Added exceptions for missing arguments in authentication
* Add parameter for the XSD validation
- Defined a parameter for the XSD validation
- Added a test case for the parameter
- Made ElasticsearchStore read mapping file from properties, not
configuration
* Implement some basic Input-Output operations for schema management
- Implemented delete, get and put methods
- Implemented newInstance and getUnionSchema utility methods
- Implemented basic serialization/deserialization for primitive Avro types
* Fix createSchema method
- Added mappings while creating an Elasticsearch index
- Added getter and setter to Datatype enum
* Implement serialization/deserialization for some Avro data types
- Implemented serializeFieldValue and deserializeFieldValue methods for
ARRAY, BOOLEAN, BYTES and FIXED Avro data types
- Fixed deserialization for STRING Avro data type
- Added javadoc descriptions
* Fix NPE when getting a non-existent Elasticsearch document
* Implement serialization/deserialization for MAP Avro data type
* Refactor serialization/deserialization to have better javadocs and
arguments
* Implement serialization/deserialization for RECORD Avro data type
* Implement serialization/deserialization for UNION Avro data type
* Fix passed Schema argument for ARRAY deserialization
* Fix BYTES deserialization for Base64 encoded String
* Ignore testGet3UnionField test
* Add javadoc descriptions to serialization and deserialization methods
* Implement newQuery method
* Implement deleteByQuery method
* Use an Enum instead of literal strings for the Authentication Type
parameter
* Use parameterized logging instead of string concatenation
* Implement execute method
* Implement getPartitions method
* Add scaling_factor support
* Remove unsupported Elasticsearch data types
* Implement Metadata Analyzer for Elasticsearch Store
* Try to fix range query by "_id" field
* Fix execute method by adding a special "gora_id" field
* Implement deleting specific fields of the records in deleteByQuery method
* Implement MapReduce test
* Fix flush method by using refresh
* Address reviewer's comments
* Add Elasticsearch specific logging dependency
---
gora-elasticsearch/pom.xml