Re: [DISCUSS] OneTable proposal

2023-12-08 Thread Stephen Williams
I am interested in participating in this.  I have a particular interest 
in creating a new style of architecture for open distributed workflow 
processing that would leverage this kind of solution.


Stephen


On 12/8/23 10:37 AM, Jacques Nadeau wrote:

FYI, I'm super supportive of the spirit of this work. The goal of the
project is a noble one that has real demand.

Sorry if my initial message sounded dour! I just wanted to make sure people
were going in eyes wide open and give this project the best chance for
success.

On Fri, Dec 8, 2023, 8:28 AM Jacques Nadeau  wrote:


I've shared this feedback with Jesus prior to the proposal but would like
to do so here as well. This project is trying to make multiple open source
projects work together. It would be nice to see representation from
multiple of those communities. I feel that true success of this project is
heavily influenced by its acceptance of multiple communities, not just
those hailing from one.

The name is a problem in that it suggests inappropriate ties with
both OneLake and OneHouse, commercial products in the same space backed by
two companies the initial ipmc hail from. Normally I would be fine with the
typical "find a new name during incubation" process but the
market this technology is within is filled with FUD and vitriol between
the backers of the multiple open source projects and I think it would be
best if the incubating project name had a placeholder name as opposed to
something that suggests any ASF sponsorship of vendor technologies. I
cynically suspect that if this is accepted we would then see one or more
vendors start writing about how ASF is backing OneTable before the renaming
would occur. Let's keep corporate influence out of this project from the
start.

On Mon, Dec 4, 2023, 11:25 AM Jesus Camacho Rodriguez
wrote:


Hi All,

I would like to propose a new project to the ASF incubator - OneTable.

OneTable[1] is an omni-directional converter for table formats that
facilitates interoperability across data processing systems and query
engines. Currently, OneTable supports widely adopted open-source table
formats such as Apache Hudi, Apache Iceberg, and Delta Lake.

Here is the proposal -
https://cwiki.apache.org/confluence/display/INCUBATOR/OneTable+Proposal

I would be the Champion of the project. I will mentor and help the project
through the incubator with Hitesh Shah [hit...@apache.org], Stamatis
Zampetakis [zabe...@apache.org], and Jean-Baptiste Onofré [
jbono...@apache.org].

We are looking forward to your feedback!

Thanks,
Jesús

[1]https://github.com/onetable-io/onetable


--


*Stephen D. Williams*
Founder: VolksDroid, Blue Scholar Foundation
650-450-8649  | fax:703-995-0407  | s...@lg.net 
 | https://VolksDroid.org  | 
https://BlueScholar.org  | https://sdw.st/in


Re: Continuing our work on GenXDM - port of XML Security to use it

2011-01-21 Thread Stephen Williams
I'm interested, especially as I've recently written a very lightweight SAX/DOM-like Java parser for fast parsing in Android/Java 
called Ssx (Super Simple XML).  Long dissatisfied with the verbosity and mistakes in the DOM API, and having created a couple 
simplified XML APIs in the past, Ssx has a most-concise DOM/mini-XPath API that is directly aimed at application data use.  The Ssx 
code has the ability to switch between SAX parsers on the fly, one of which is internal.


One method that I needed and that I think should be part of DOM-like APIs is a call like toXml() which returns parseable XML for any 
element.  It should either assume a given set of namespace declarations (meaning it is more of a fragment) or it should generate 
namespace declarations for everything active at the point in the tree.  Ssx does the latter so far in an efficient way.
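
As a rough illustration of the second option (declaring everything active at that point in the tree), here is a minimal sketch. It is written in Python against the standard xml.dom.minidom module rather than Ssx, which is not published, so the function names in_scope_namespaces and to_xml are hypothetical and nothing here reflects how Ssx actually does it.

```python
from xml.dom import minidom, Node


def in_scope_namespaces(element):
    """Return {prefix: uri} for every namespace binding active at element.

    The default namespace is stored under the empty prefix. Declarations on
    nearer ancestors take precedence over those further up the tree.
    """
    bindings = {}
    node = element
    while node is not None and node.nodeType == Node.ELEMENT_NODE:
        for name, value in node.attributes.items():
            if name == "xmlns":
                bindings.setdefault("", value)
            elif name.startswith("xmlns:"):
                bindings.setdefault(name[len("xmlns:"):], value)
        node = node.parentNode
    return bindings


def to_xml(element):
    """Serialize element as standalone XML with all in-scope namespaces declared."""
    clone = element.cloneNode(True)  # deep copy, so the source tree is untouched
    for prefix, uri in in_scope_namespaces(element).items():
        attr_name = "xmlns:" + prefix if prefix else "xmlns"
        if not clone.hasAttribute(attr_name):
            clone.setAttribute(attr_name, uri)
    return clone.toxml()


if __name__ == "__main__":
    doc = minidom.parseString(
        '<r xmlns="urn:d" xmlns:a="urn:a"><a:item><b/></a:item></r>'
    )
    print(to_xml(doc.documentElement.firstChild))
```

The sketch declares every binding that is in scope, including ones the fragment does not use. That keeps the walk simple at the cost of some extra attributes on the fragment root.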


I can't publish Ssx yet, but hopefully soon.

Additionally, we've begun the process of getting interest here in an OpenEXI incubator project.  EXI is the W3C Efficient XML 
Interchange proposed standard for compact, efficient-to-process binary XML.  We have two open source code bases (one just open 
sourced) that we are currently combining that will form the basis for the project.  When a little more complete, we will continue 
that discussion.


Stephen

On 1/21/11 9:04 AM, Eric Johnson wrote:
I've previously mentioned our GenXDM project on this mailing list. And I posted an incubator proposal (gXML at the time).  As a 
quick reminder, GenXDM defines a Java API for the XQuery Data Model, via a layer of indirection, in such a way that you can choose 
different XML tree implementations at runtime, with minimal overhead.
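
To make the "layer of indirection" idea concrete, here is a minimal sketch in Python. It is not GenXDM's API (which is Java and considerably richer); the class names MinidomModel and ElementTreeModel and the function element_names are hypothetical. The point is only that generic code talks to a small "model" object, so the same logic can run over different tree implementations chosen at runtime.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom, Node


class MinidomModel:
    """Navigation for nodes produced by xml.dom.minidom."""

    def name(self, node):
        return node.tagName

    def children(self, node):
        return [c for c in node.childNodes if c.nodeType == Node.ELEMENT_NODE]


class ElementTreeModel:
    """Navigation for nodes produced by xml.etree.ElementTree."""

    def name(self, node):
        return node.tag

    def children(self, node):
        return list(node)


def element_names(model, node):
    """List element names in document order, using only the model's methods."""
    names = [model.name(node)]
    for child in model.children(node):
        names.extend(element_names(model, child))
    return names


if __name__ == "__main__":
    xml_text = "<a><b/><c><d/></c></a>"
    dom_root = minidom.parseString(xml_text).documentElement
    et_root = ET.fromstring(xml_text)
    print(element_names(MinidomModel(), dom_root))
    print(element_names(ElementTreeModel(), et_root))
```

Because element_names never touches a node directly, swapping the tree implementation means passing a different model and nothing else.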


At the time, it appeared we didn't attract enough interest to go through with incubating at Apache.  We're still hoping to do 
that, though.


Since I posted our proposal, we've been busy.  Of particular note to this mailing list, as a proof of concept, we've done a 
complete port of the XML Security Java library (Santuario) to the GenXDM APIs.  And we've now released that over at the Apache 
Extras site.


We kept the port fully backwards compatible (all existing tests pass unmodified!), and added to the API, so that you can use 
Santuario with non-DOM XML trees.


As we are still interested in incubating GenXDM at Apache, I wanted to mention our port here, as several people mentioned at the 
time that they wanted to see more, before deciding whether it made sense to get involved.


The projects:
http://code.google.com/p/genxdm/
http://code.google.com/a/apache-extras.org/p/santuario-genxdm/

We welcome you to stop by, kick the tires, and join our mailing list, as you 
see fit!

Thanks.

-Eric.

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



--
Stephen D. Williams s...@lig.net stephendwilli...@gmail.com LinkedIn: http://sdw.st/in V:650-450-UNIX (8649) V:866.SDW.UNIX 
V:703.371.9362 F:703.995.0407 AIM:sdw Skype:StephenDWilliams Yahoo:sdwlignet Resume: http://sdw.st/gres Personal: http://sdw.st 
facebook.com/sdwlig twitter.com/scienteer





[Proposal] OpenEXI Proposal

2010-12-13 Thread Stephen Williams
 can help meet the goals of EXI to make it an adopted and utilized industry binary XML standard.


The OPENER-EXI solution is best fitted with an open and free license (such as Apache) to increase the expected likelihood of 
widespread adoption. At the same time this grants corporations the right to customize the OPENER-EXI solution and package it into 
their existing products, as they see fit, for profit. Placing a non-viral free license on the OPENER-EXI code allows it to be used 
without restrictions with proprietary source, which should encourage the corporations to adopt the solution into their codebase. 
This in turn helps to deliver a wider dissemination of EXI solutions.


Initial Goals

A series of deliberate steps are needed to accomplish these important outcomes. Project goals are listed for the various planned 
milestones of the project:


Initial configuration and setup

* Donate existing codebases from initial contributors.
* Set up the incubation infrastructure (svn repository, build scripts, test document corpus, measurements suite, regular working 
group resources, etc.) to prepare for continuous development, testing and releases.


Initial integration of Java build

* Integrate the two initial codebases (schema-less implementation and schema-informed implementation) into a single consolidated 
codebase.
* Add core format capabilities that are missing in the existing codebases. These include support for EXI header options, built-in 
datatype codecs, compression options and XML Schema regular expressions.
* Make sure all core features pass the interoperability test suite already developed by W3C EXI Working Group. TODO add links at W3C 
and NPS
* Produce an initial release that demonstrates the core features of EXI.
* Add more format capabilities to achieve complete coverage of EXI specification. These include support for XML fragments, datatype 
representation map, etc. Again validate the implementation by running the interoperability test suite.


Correctness and optimization of Java build

* Produce the second major release that provides a complete implementation of 
all EXI features in Java.
* Measure, document and profile codebase performance using the already-created JAPEX testing framework. Optimize the codebase for 
compaction efficiency and decompression performance.
* Continue releases of the Java codebase until working group consensus is achieved that the implementation is well-structured, 
efficient and high-performance.


Create and test corresponding C++ build

* Create a corresponding C++ codebase that matches the architecture of the Java codebase. Shared improvements to the common 
architecture may also be valuable at this point.
* Perform testing and optimization as necessary to achieve comparable or 
superior performance.
* Create an Apache HTTP module that plugs in the C++ implementation and provides all configuration settings needed to ensure proper 
HTTP support for EXI.
* Continue codebase development to add EXI utility packages providing common APIs similar to SAX, DOM, StAX, etc., for both Java and 
C++ codebases.
* Ensure that all documentation and examples are complete, matching the high 
quality of other Apache work.

Current Status

We are collaboratively editing and discussing this proposal. Next steps:

* We are ready to discuss this incubator proposal with the Apache Software Foundation (ASF) on the Apache Incubator list to begin 
following the Apache process.
* Please contact Stephen Williams to discuss who on the Apache team might 
sponsor and mentor this project.
* We will also move this proposal to the SourceForge openexi project, and update 
the website pages there to describe this new work.
* Our next teleconference for discussing this work is
    o Monday 20 December 2010 (1500 Pacific, GMT-8)
    o Dial +1.831.656.6500, Code 831.656.2149#

Completed progress:

* Finish draft proposal 10 November 2010 - complete
* Invitation sent to Siemens and W3C EXI Working Group members to consider 
participating or sponsoring - complete
* Proposal briefing and discussion planned for the W3C EXI Working Group 17 November 2010 teleconference - complete, positive 
response received
* Progress with Apache outreach was discussed on our 24 November 2010 
teleconference
* Based on discussion on the Apache Incubator list this proposal was moved to the Apache Incubator Wiki as the OpenExiProposal 
during our 6 December 2010 teleconference


Meritocracy

The people who have developed the codebases for initial contribution have ample experience with meritocracy-based engineering in 
multiple projects including W3C EXI Working Group and Web3D Consortium activities. In each case, standards development and 
deployment have been driven by open software development in partnership with commercial software development.


Meritocracy succeeds and flourishes when individual motivation and commitment are honored. People rise to the best possible levels 
of performance and effort when given

Introduction for Stephen D. Williams (sdw)

2010-12-03 Thread Stephen Williams

For the purposes of supporting the about-to-be-proposed EXI incubator project, 
I'll introduce myself here:

I have a long background [3] in software development and related activities.  In addition to IETF (presence/IM) and W3C 
(XBC,EXI) participation for many years, I have long (since the mid-80s) used, promoted, and contributed to open source 
projects.  Recently, in addition to OpenEXI, I have been working on a fork of XBig for Java-C++ glue (overdue for release), 
creating a new Java PC simulator / visualizer as an applet (NSF grant with a professor at American University), and creating a 
new, very concise Java DOM/SAX parser (pending license release).  I have also been finding and working around bugs, limitations, 
and Apache software versioning issues for Android development.  At one point I was listed as a Linux kernel contributor (albeit 
for something minor).  I have been a CTO at a few startups, including, briefly, Jabber.com, Inc.  I'm currently an independent 
self-employed consultant (OptimaLogic, Inc.) doing deep Android Java/C++ and Qt/C++ development involving GUI interface and 
complex OAuth-like REST web APIs with crypto, XML, and related technologies.  Other recent development involves computer vision, 
augmented reality, and semantic desktop related information visualization, management, and sharing.


Thanks,
sdw




Ready to propose EXI incubator candidate

2010-12-03 Thread Stephen Williams
This is a call for a champion and mentors for an incubator project we are about to propose formally for implementations of the 
W3C EXI specification and related technologies.  The current proposal abstract, proposal, and background for EXI can be found at 
the end of this message for reference[4].


Some current and former members of the W3C EXI [1] / XBC [2] working groups are interested in submitting an Apache Incubator 
proposal for EXI.  Open EXI is an existing open-source project with a partial EXI implementation under the Apache 2.0 license.  
Another more-complete commercially-developed implementation is being released soon, also under the Apache 2.0 license.  Further 
contributions will be made soon.


We have a proposal, to be released shortly, that seems complete with an initial set of five committers.  We believe we now need 
a champion and nominated mentors.  I have agreed to lead this effort and would be happy to act as a mentor if promoted to that 
status.


[1] W3C Efficient XML Interchange http://www.w3.org/XML/EXI/
[2] XML Binary Characterization http://www.w3.org/XML/Binary/
[3] http://sdw.st/gres

[4] Current Open-EXI Apache Incubator Proposal introduction to EXI:

Abstract

Efficient XML Interchange (EXI) is a forthcoming W3C Recommendation for 
compression and high performance decompression of XML. This standard has wide 
applicability to all forms of XML documents and consistently beats zip/gzip in 
terms of compactness. Multiple software implementations are beginning to 
emerge. This work will establish a high performance open source codebase in 
both Java and C++ that can immediately be used in bandwidth-limited 
environments and other software applications that are not currently well served 
by XML. It may later be integrated into HTTP servers and clients.

Proposal

This proposal seeks to create a project within the Apache Software Foundation 
to develop an implementation of the current EXI Candidate Recommendation, and 
to track changes to the Candidate Recommendation as it progresses to an 
approved W3C standard. The initial implementation will be in Java, and a 
subsequent C++ implementation will follow. Once implemented, the EXI standard 
could be used in many other Apache projects, such as the web server, web 
services, etc.

The EXI specification is available at the EXI Working Group Public Page. A 
Primer on EXI is available there, as are an evaluation of the likely impacts 
and best practices. An evaluation and measurement note are available; these 
notes are a product of the test framework results.

Background

Since the inception of XML, many data exchange applications have found XML 
an appealing fit, only to find it prohibitive because of the sometimes very 
costly inefficiency of its inherent verbosity. Legacy applications involving 
data exchange, for example, typically use non-XML data formats (e.g. ASN.1 PER) 
that predate XML, are often far more efficient, and in some cases are 
hand-optimized to achieve the best performance. When such applications attempt 
to harness the numerous benefits of XML, they often find XML too bulky to adopt 
given the bandwidth constraints of existing communication infrastructure that 
was designed with the currently used format in mind. Another example is a 
data-intensive mobile application for which bandwidth is at a premium and the 
use of XML is not realistic because of its substantial bandwidth cost. While 
some other use cases address the bloated message size with general-purpose 
compression methods such as GZip, applying such methods more often than not 
compounds the efficiency problem for the use cases above, because GZip usually 
degrades processing efficiency dramatically and has little or no impact on 
message size when individual messages are short.
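
The point about short messages can be checked with a small experiment. The sketch below uses Python's standard gzip module; the sample messages are made up for illustration and are not taken from any EXI test corpus. A gzip stream carries a fixed header and trailer, so for a message of a few dozen bytes that overhead can exceed anything the compressor saves, while a long, repetitive document compresses well.

```python
import gzip

# Hypothetical sample data: one short sensor-style message, and a long
# document made of the same element repeated many times.
short_message = b"<temp unit='C'>21.5</temp>"
long_message = b"<log>" + short_message * 200 + b"</log>"


def gzip_size(data):
    """Return the size in bytes of data after gzip compression."""
    return len(gzip.compress(data))


if __name__ == "__main__":
    for label, data in (("short", short_message), ("long", long_message)):
        print(label, "original:", len(data), "gzipped:", gzip_size(data))
```

This only looks at size. It says nothing about the processing cost that the paragraph above also raises.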

Over the years, numerous file formats have been developed that purport to 
serve as alternative, efficient representations of XML data. In 2005, the W3C 
(World Wide Web Consortium) XBC WG (XML Binary Characterization Working Group) 
found that most, if not all, of those formats are not very general, in the 
sense that each had been designed to target a particular problem domain and 
does not serve use cases in other domains well. In 2006, W3C launched the EXI 
(Efficient XML Interchange) WG with a charter to study and formulate a single 
alternative format that is more efficient than the customarily used formats 
(e.g. ASN.1 and GZip) and even competes with hand-optimized formats, that 
covers the broadest range of use cases and platforms, including those that had 
not been well served by XML, and that remains compatible with XML and 
integrates well with the existing XML family of standards and applications 
without major disruption.

As of this writing, EXI is a W3C Candidate Recommendation, and