Hi Dan and all,

Here is the basic class diagram for the domain entities in RB:
http://yuml.me/825d7db5

Please note that I have used the name EmailContact instead of
EmailSenderProfile for clarity. Effectively, this entity represents
the email contacts in the user's inbox.

Each email and each email contact will have a corresponding Reputation
entity. In the RB web application, EmailReputationViewModel will display
emails with their reputation data, and ContactReputationViewModel will
display email contacts with their reputation data.
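
To make the relationships concrete, here is a rough plain-Java sketch of how
I picture them. It is only an illustration, not the actual RB code: the field
names, the [0, 1] score range and the simple mean used to accumulate a
contact's score are all assumptions, and the Isis/JDO annotations are left
out on purpose.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the RB domain model; all names and rules are assumptions.
final class RbDomainSketch {

    /** Reputation value attached to exactly one email or one contact. */
    static final class Reputation {
        private final double score; // assumed to be normalized to [0.0, 1.0]

        Reputation(double score) {
            if (score < 0.0 || score > 1.0) {
                throw new IllegalArgumentException("score must be in [0, 1]");
            }
            this.score = score;
        }

        double getScore() { return score; }
    }

    /** A contact in the user's inbox (called EmailSenderProfile in the proposal). */
    static final class EmailContact {
        private final String address;
        private final List<Email> emails = new ArrayList<>();

        EmailContact(String address) { this.address = address; }

        String getAddress() { return address; }

        void addEmail(Email email) { emails.add(email); }

        /** Assumed rule: mean of the scores of all emails sent by this contact. */
        Reputation getReputation() {
            if (emails.isEmpty()) {
                return new Reputation(0.0);
            }
            double sum = 0.0;
            for (Email email : emails) {
                sum += email.getReputation().getScore();
            }
            return new Reputation(sum / emails.size());
        }
    }

    /** An email with its sender and the score given by the ML process. */
    static final class Email {
        private final String id;
        private final EmailContact sender;
        private final Reputation reputation;

        Email(String id, EmailContact sender, double score) {
            this.id = id;
            this.sender = sender;
            this.reputation = new Reputation(score);
            sender.addEmail(this); // keep the contact's email list in step
        }

        String getId() { return id; }
        EmailContact getSender() { return sender; }
        Reputation getReputation() { return reputation; }
    }

    /** Read-only view of an email together with its reputation data. */
    static final class EmailReputationViewModel {
        private final Email email;

        EmailReputationViewModel(Email email) { this.email = email; }

        String getEmailId() { return email.getId(); }
        String getSenderAddress() { return email.getSender().getAddress(); }
        double getScore() { return email.getReputation().getScore(); }
    }

    /** Read-only view of a contact together with its reputation data. */
    static final class ContactReputationViewModel {
        private final EmailContact contact;

        ContactReputationViewModel(EmailContact contact) { this.contact = contact; }

        String getAddress() { return contact.getAddress(); }
        double getScore() { return contact.getReputation().getScore(); }
    }
}
```

The view models hold no state of their own; they only read from the entities,
which matches the idea of keeping the web application read-only.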

Your ideas and suggestions are most welcome.

Thanks,
Dileepa


On Tue, Mar 25, 2014 at 3:42 PM, Dileepa Jayakody <dileepajayak...@gmail.com
> wrote:

> Hi Dan,
>
> Thanks a lot for your insight. Please see my comments inline below.
>
>
> On Tue, Mar 25, 2014 at 1:21 PM, Dan Haywood <d...@haywood-associates.co.uk
> > wrote:
>
>> Hi Dileepa,
>>
>> I've just posted the comments below on your GSOC proposal.  I know that
>> you can't make further changes to the proposal, so I'm posting them here on
>> the dev list, so we can keep the conversation going.
>>
>> So..
>>
>> * good to see you intend to set up a project on github for this; please
>> do this asap.  That way you can start to capture docs/working notes.  I
>> also suggest that you set up github pages for your site [1].
>>
>
>> * What I'd like to see right now is some sort of UML diagram; you could
>> sketch one using yuml.me [2] and add it to your github site.  I can't
>> quite work out how the persistent domain entities relate to each other.  In
>> particular, are EmailSenderProfile and Reputation in 1-1 correspondence?
>>
>
> I will draw an ER diagram for the domain entities, and we can refine it
> through discussion.
> Yes, I pictured EmailSenderProfile as the representation of an email sender
> (a contact). Each email sender will have a corresponding reputation score
> (accumulated and normalized over the emails sent by that sender),
> represented by the Reputation domain entity.
>
>
>>
>> * In your timeline I noticed you said "Commit all code to github", only
>> on Aug 11.  It's much better practice (and will help mentors guide you) if
>> you commit changes as you go.  That way it's also safely backed up, and you
>> can go back in time if you mess up.
>>
>
> Yes, I agree. In fact, I didn't mean that I would commit all the code at
> once on Aug 11. I meant that I plan to finish development and have
> everything committed by Aug 11.
> I strongly agree on getting feedback along the way; after all, I intend to
> use agile development for my project :). Sorry for wording this in a
> misleading way in the proposal.
>
>>
>> * You might also want to version control the academic paper, too, if your
>> university lets you.
>>
>>
>> Some further points relating to the design:
>>
>> * You have Email as a persistent entity.  I'm a bit worried what that
>> might mean about storage and also synchronization.  Is it necessary to have
>> the Email persisted in Isis?  If not persisted, then should the Email
>> entity be a view model, or as a fake persistent entity utilizing a new
>> StoreManager impl in JDO.  See the recent thread [3] on this topic.
>>
>
> The Email entity will have several attributes such as id, sender-id and
> reputation-score. sender-id will be mapped to the EmailSenderProfile, and
> reputation-score will be a score given by the ML process that evaluates the
> reputation of the email. Could the Email entity be a view model in this
> scenario? If so, what is the advantage of defining it as a view model?
>
> I think we can discuss this further with an ER diagram for the application.
> I will come up with an ER diagram asap.
>
>
>> * Conversely, does Mahout require some sort of persistent dataset of
>> emails in order to do the reputation scoring?  Or does it just hold
>> aggregated information?  If the former, I worry that we now have each email
>> stored in potentially 3 places: gmail, Isis and Mahout.  Keeping these in
>> sync would be a nightmare.
>>
>
> AFAIK the Mahout process requires a persistent dataset (file-based or
> database-based) to train the classifier, and it will build a classifier
> model (an aggregated information structure describing how to classify new
> data). Mahout will not persist the email data again.
> Therefore I feel Mahout will need access to the email dataset either
> straight from gmail as the data source or from an Isis data source (after
> retrieving all emails into Isis).
> If you think retrieving and storing all emails in Isis is not a good idea,
> maybe the EmailService can be implemented only as a connector from gmail
> to Mahout.
>
>>
>> * It occurs to me that you're going to need some entities to keep track
>> of the high water mark of the most recently analyzed email, so that when
>> you poll for new emails you know which to ask for.  This high water mark is
>> per user of RB.  So I think you'll either need an entity to represent your
>> RB User, or you could use the UserSettings service [4][5]
>>
>
> Yes, I will definitely need an entity to represent the RB User. In fact,
> user management will also be a key aspect of the application, since one
> user should not be able to access another user's email or reputation data.
> Thanks for the suggestions. Would it be a good idea to extend the
> UserSettings entity to represent RB-specific user data, or to have a
> separate entity for RB_User?
>
>
>>
>> * In the proposal the term "reputation index" is associated with
>> the email sender.  Is that the same as "Reputation"?
>>
>
> Yes. By "building the reputation index" I meant that the initial reputation
> analysis process will generate the initial reputation scores for all past
> emails and create a Reputation profile for each EmailSender.
>
>>
>> * The initial download of emails for analysis probably needs to be done
>> in multiple batches (of say 100 at a time), in case there's a
>> glitch/network issue.
>>
>
> Agreed. I think the Isis BackgroundService can be used for this?
>
>>
>>
>> * I was interested to note that you see the Isis webapp as being an email
>> client itself.  I suggest you keep it as read-only, though... otherwise
>> you'll end up reinventing all of gmail (not advisable, I think).
>>
>
> Yes, I would keep the webapp as a read-only application for demo purposes,
> basically as a presentation layer for the view model:
> EmailReputationViewModel will display the recent emails and their
> reputation information, as well as the reputation profiles of the email
> senders.
>
>>
>> * One of the first tasks you've set yourself (til 21 Apr) is to "try out
>> Apache wicket samples [10] to learn how to develop the presentation layer
>> of the application".  In fact, with Isis you don't need to do any
>> presentation layer coding; start building out your prototype and you'll see
>> what I mean.
>>
>
> I wanted to try out Apache Wicket to get an understanding of the Wicket
> configuration and programming model for developing view models. :)
>
>>
>> * I'm still unsure about oAuth integration.  The EmailService is going to
>> require credentials to access gmail, and that's "within" the Isis domain
>> model.  But Shiro/buji-pac4j sits in front of Isis.  If Shiro has done the
>> oAuth sign-in, then I guess it'll be necessary to surface those credentials
>> somehow to the EmailService (perhaps using Shiro's
>> org.apache.shiro.SecurityUtils#getSubject() method).  Perhaps the best thing
>> is to get buji-pac4j done, then see what information is surfaced that way.
>>
> Yes, this requires some research. I wanted to implement RB as a web
> application that doesn't ask for the user's email credentials to perform
> the reputation analysis process. In the worst case it will require the
> user's email credentials for the EmailService's email retrieval process.
>
> In summary, thanks a lot for your insight into the project. I will set up a
> github project and come up with an ER diagram asap.
>
> Thanks,
> Dileepa
>
>>
>>
>> HTH
>> Dan
>>
>> [1] http://pages.github.com/
>> [2] http://yuml.me/
>> [3] http://isis.markmail.org/thread/lsg3uywlfjviztzi
>> [4] http://isis.apache.org/reference/services/settings-services.html
>> [5]
>> http://isis.apache.org/components/objectstores/jdo/services/settings-services-jdo.html
>>
>>
>
