Hey Suresh,

Yep. Depending on Airavata's repository search needs, we can also pull in
Apache Lucene and Solr as needed. I'm very familiar with those technologies
and am a former member of the Lucene PMC, so I know those guys and their
technology well.

Cheers,
Chris

On Aug 9, 2011, at 7:28 PM, Suresh Marru wrote:

> 
> On Aug 9, 2011, at 1:25 PM, Mattmann, Chris A (388J) wrote:
> 
>>> It indeed looks like a very active project and the reference implementation
>>> for JCR; thanks for the pointer. I was poking through the documentation but
>>> have not yet gotten my hands dirty. It might be quicker to ask you: do you
>>> know how easy it will be to add custom schemas and make the content of the
>>> documents searchable? For example, can I add a WSDL or a BPEL document and
>>> find out, across the repository, which of the application service WSDLs
>>> wrap the Gaussian molecular chemistry model? This is just an illustrative
>>> example, but I am curious how the indexes will be built for content and how
>>> bad the performance will be if we make a lot of content searchable.
>> 
>> I definitely think you can do this, as you can define user-tags on the 
>> content items at each node in the repository and then search for those nodes 
>> later on. It's probably best to sign up to [email protected] and 
>> ask there but that's based on my limited understanding of the system.
> 
> Thanks Chris for this additional information.
> 
> I will create some JIRA tasks so we can try out JCR and Jackrabbit for some 
> simple repository tasks in gfac and xbaya. I think Airavata will have more 
> complicated repository tasks, but to start with we can try simple examples. 
> As a long-term task, I think it will be better if we consolidate all Airavata
> repository needs so we can create interfaces and try out different 
> implementations before we agree upon one. 
> 
> Suresh
> 
> 
>> Thanks,
>> Chris
>> 
>>> 
>>> Thanks for your insights,
>>> Suresh
>>> 
>>>> 
>>>> Cheers,
>>>> Chris
>>>> 
>>>> On Aug 9, 2011, at 9:55 AM, Suresh Marru wrote:
>>>> 
>>>>> Hi All,
>>>>> 
>>>>> We are stalled on this thread, so how about getting to a consensus? Since
>>>>> I did not see any further discussion on the use of schemas, should we
>>>>> assume we want to retain XML Schemas and add simplified beans that are
>>>>> easier to work with than the generated xmlbeans? The schemas for
>>>>> reference are at [1]. Also, as Patanachai explained in the original
>>>>> message below, there are three types of schema documents for GFAC,
>>>>> describing the computational host, the application deployment, and
>>>>> finally the service interface. Using these three descriptions, an
>>>>> application service WSDL is generated and GFAC manages the deployed
>>>>> application on various computational resources. There is a mapping
>>>>> between these deployment descriptions. I am reading the JCR API document
>>>>> [2] and am intrigued by its relevance. But my inference is from a
>>>>> theoretical standpoint, and I am wondering if anyone on the list has
>>>>> experience, good or bad, working against the JCR spec.
>>>>> 
>>>>> Suresh
>>>>> 
>>>>> [1] - 
>>>>> https://svn.apache.org/repos/asf/incubator/airavata/trunk/modules/commons/gfac-schema/schemas/
>>>>> [2] - http://jcp.org/en/jsr/detail?id=283
>>>>> 
>>>>> On Aug 1, 2011, at 12:07 AM, Suresh Marru wrote:
>>>>> 
>>>>>> Hi Patanachai,
>>>>>> 
>>>>>> Thanks for explaining the issue in detail. In simple terms, we need
>>>>>> multiple client components to register a description of an application
>>>>>> and store it in a registry. GFac will need to pull the registered
>>>>>> description document, then execute and manage the compute job. Along with
>>>>>> XBaya as the client which registers the document, there are other
>>>>>> clients, including a gadget interface.
>>>>>> 
>>>>>> I agree that the current scheme has to be revisited (and minor issues
>>>>>> fixed, like the one you mention about the gridftp tags). But moving from
>>>>>> XML Schema to a lightweight option is a bigger question. With a proper
>>>>>> bean generation library and serializing/deserializing methods, I
>>>>>> personally favor XML Schema, but I do not want to be biased either. I am
>>>>>> -1 for POJO simply because it will limit non-Java-based clients like a
>>>>>> simple PHP web form. JSON in general sounds like a good alternative, but
>>>>>> I do not have experience with it in a validation and schema sense.
>>>>>> 
>>>>>> I will wait for others to chime in. If no better alternatives are
>>>>>> suggested, I will import the missing GFac schema from the code donation
>>>>>> into a commons area -
>>>>>> https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/
>>>>>> 
>>>>>> Cheers,
>>>>>> Suresh
>>>>>> 
>>>>>> On Jul 29, 2011, at 2:09 PM, [email protected] wrote:
>>>>>> 
>>>>>>> Hi devs,
>>>>>>> 
>>>>>>> I want to discuss the type system in GFAC-Core.
>>>>>>> 
>>>>>>> Currently, the GFAC module reads and writes the necessary information
>>>>>>> based on an XML schema (called GFAC-Schema) as a definition. The
>>>>>>> GFAC-Schema library is generated with XMLBeans
>>>>>>> (http://xmlbeans.apache.org/) and is referenced in the project.
>>>>>>> 
>>>>>>> Examples of GFAC-Schema are:
>>>>>>> HostTypeDescription, which describes an environment for a host such as
>>>>>>> Java version, Temp directory, GridFTP endpoint, etc.
>>>>>>> ServiceTypeDescription, which describes a service such as parameters,
>>>>>>> service name, etc.
>>>>>>> GFAC-SimpleType, which defines a simple parameter type to the service,
>>>>>>> such as Boolean, Double, Integer, etc.
>>>>>>> 
>>>>>>> This is roughly how the system works:
>>>>>>> After deploying their software on a computing host, users register
>>>>>>> their host, application, and service descriptions via XBaya-GUI (Java
>>>>>>> Swing).
>>>>>>> This registration information is saved to XRegistry as an XML string
>>>>>>> according to the XML schema.
>>>>>>> When users invoke a (Web) service, GFAC loads the necessary information
>>>>>>> (host, application directory, parameters, etc.) and executes the
>>>>>>> deployed software.
>>>>>>> Then, GFAC parses the output from the software, wraps it, and sends it
>>>>>>> out in the appropriate parameter type format.
>>>>>>> 
>>>>>>> 
>>>>>>> So, the question is: do we want to continue using XML Schema?
>>>>>>> If we agree to use XML Schema, we should import some initial schemas
>>>>>>> from OGCE GFAC as a new module in Airavata. Also, we need to redesign
>>>>>>> some schemas.
>>>>>>> For instance, the current HostType schema requires a GridFTP Endpoint
>>>>>>> element, which is not necessary if a computing host doesn't have GridFTP.
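
If the XML Schema route is kept, one possible way to handle the GridFTP case is to mark the element optional with minOccurs="0". The following is only a minimal sketch under stated assumptions: the element names hostDescription, hostName, and gridFTPEndpoint are hypothetical and are not taken from the actual GFAC-Schema, and the validation uses the JDK's standard javax.xml.validation API rather than XMLBeans.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class OptionalGridFtpSketch {

    // Hypothetical, simplified host schema (NOT the real GFAC-Schema).
    // hostName is required; gridFTPEndpoint is optional via minOccurs='0'.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='hostDescription'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='hostName' type='xs:string'/>"
      + "<xs:element name='gridFTPEndpoint' type='xs:anyURI' minOccurs='0'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true if the given XML document validates against XSD.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String withoutGridFtp =
            "<hostDescription><hostName>cluster.example.org</hostName></hostDescription>";
        String withGridFtp =
            "<hostDescription><hostName>cluster.example.org</hostName>"
          + "<gridFTPEndpoint>gsiftp://cluster.example.org:2811/</gridFTPEndpoint>"
          + "</hostDescription>";
        String missingHostName = "<hostDescription/>";

        System.out.println("without GridFTP valid: " + isValid(withoutGridFtp));
        System.out.println("with GridFTP valid:    " + isValid(withGridFtp));
        System.out.println("missing hostName valid: " + isValid(missingHostName));
    }
}
```

In this sketch, a host without GridFTP still validates, while a description missing the required hostName is rejected, so the schema keeps its validation role without forcing every host to declare an endpoint.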
>>>>>>> 
>>>>>>> Otherwise, what do you propose? POJO, JSON, etc.
>>>>>>> 
>>>>>>> -- 
>>>>>>> Best Regards,
>>>>>>> Patanachai Tangchaisin
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Chris Mattmann, Ph.D.
>>>> Senior Computer Scientist
>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>>> Office: 171-266B, Mailstop: 171-246
>>>> Email: [email protected]
>>>> WWW:   http://sunset.usc.edu/~mattmann/
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Adjunct Assistant Professor, Computer Science Department
>>>> University of Southern California, Los Angeles, CA 90089 USA
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> 
>>> 
>> 
>> 
>> 
> 


