Archive for the 'in the wild snapshot' Category

Shameless self-promotion

My in the wild posts got some attention this week and I felt compelled to share them with you – like I said, shameless self-promotion 🙂

  • DataPortability.org May Progress report; click on either the HTML or PDF link to see the actual report.
  • Trent Adams, an active DataPortability.org contributor and founder of MatchMine, interviewed me about my in the wild posts for his regular podcast series. You can listen to my podcast interview here. While you are there, check out his other podcast interviews with people like Joseph Smarr (Chief Platform Architect) and John McCrae (VP Marketing) of Plaxo; Kevin Marks (Developer Advocate for OpenSocial) of Google; Robert Scoble, Managing Director of FastCompany.tv; etc.

As a side note, I plan on changing the focus of my startup, JiggyMe (video aggregation site), to feature only technology videos, so feel free to add your technology videos there.

In the wild snapshot#3: DiSo profile plugin

I had an excellent conversation with Stephen Paul Weber, an active DiSo plugin developer, on his experience with the DiSo profile plugin. For those of you unfamiliar with this series of posts, the idea is to create blog-length interviews with various in the wild apps describing their processes and the technologies that they use with regards to data portability. The goal is to profile real use cases, solutions, and lessons learned when it comes to the current state of affairs for data portability technology. Note that these posts aren’t meant to recommend or not recommend certain technology, I leave that up to the developers/architects to decide based on their needs. If you have such an app and are interested in being interviewed, please leave me a comment on one of my posts and I will get in touch with you.

DiSo Project Background
Straight from the DiSo Google group About page:

Social networks are becoming more open, more interconnected, and more distributed. Many of us in the web creation world are embracing and promoting web standards – both client-side and server-side. Microformats, standard apis, and open-source software are key building blocks of these technologies.

DiSo (dee • zoh) is an umbrella project for a group of open source implementations of these distributed social networking concepts. Or, as Chris Messina puts it: “to build a social network with its skin inside out”.

You can also listen to an interview with Chris Messina about DiSo.

At this stage, DiSo plugins only work on self-hosted WordPress blogs, which means that if you have a blog on wordpress.com, you are out of luck. Also, all DiSo plugins are currently written in PHP, WordPress's language of choice. Visit the WordPress site for instructions on how to host your own WordPress blog and install plugins.

Application Overview
The DiSo WordPress profile plugin has the following main features:

  • When a user signs up for a WordPress account, the plugin makes it easier to import the user's profile information via hCard and XFN (if available)
  • Once a user has signed up for a WordPress account, the plugin makes it easier for the user (now a blog owner) to publish their own profile with standards like hCard and XFN (see the markup sketch after this list)
  • It supports permission features allowing the blog owner to restrict access to his information based on predefined relationships, e.g., I can't see his phone number, but friends who log in with their OpenID and are present on his authorized list of friends can
  • A sidebar widget displays the names/avatars of the most recently logged-in visitors
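To make those standards concrete, here is a minimal sketch of a profile published with hCard and XFN rel-me. It is illustrative only (names and URLs are made up), not the plugin's literal output:

<!-- Hypothetical profile markup: the hCard describes the owner, and rel-me
     links tie together the owner's other online identities. -->
<div class="vcard">
  <a class="url fn" href="http://example.com/">Jane Blogger</a>
  <span class="tel">555-0100</span> <!-- the field the permission system hides from strangers -->
  <ul>
    <li><a href="http://twitter.com/janeblogger" rel="me">me on Twitter</a></li>
    <li><a href="http://flickr.com/photos/janeblogger" rel="me">me on Flickr</a></li>
  </ul>
</div>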

Technology
The key technical pieces are hCard, XFN (rel-me, rel-contact), PHP5, and the standard WordPress plugin architecture. The plugin should work on WordPress 2.0 and above, and it has been tested on 2.3 through 2.5. Currently the plugin mimics Google Social Graph API (SGAPI) functionality without the FOAF bit; FOAF was considered but not implemented, another item for the future perhaps. He plans to add native SGAPI support, which wasn't available when the plugin was written, so that too is something for the future; Steve Ivy wrote a PHP wrapper for SGAPI.

While the plugin works with OpenID, it does not require it. There is a button to import a profile, and it can fetch profile information even if the source is not an OpenID URL. OpenID profile extraction of XFN and hCard happens automatically upon registration and login. For the OpenID features to work, the plugin needs the WP-OpenID plugin; no other libraries or plugins are required, and in fact the import button works fine if WP-OpenID is not installed. To display the user's profile, the user adds a WordPress template tag: there is a page token for rendering on a WordPress page and a PHP function for adding the profile directly to the template (both documented on the plugin page). So far, most people don't use it as a sidebar widget and instead display their profile information inline in the blog.

For an example of the plugin in action, check out Stephen's blog: it powers the top half of his main page and the avatars of recent visitors in his sidebar.

Lessons learned
Some people have hCard on their OpenID pages, usually via OpenID delegation, or directly on the page. A large number of people have rel-me links pointing to their main profile somewhere else. In his opinion, the biggest hurdle is still HTML parsing in PHP, which is surprising to me since PHP is such a popular web development language. Even though PHP has excellent XML support, if the HTML is broken or incomplete, as is often the case out on the web, there is no standard library to handle it. One option is to fix the markup with HTML Tidy, but most shared hosting providers (like DreamHost) do not have HTML Tidy installed. Without HTML Tidy, the plugin has to run the page through the W3C's remote tidy proxy, which can be slow. Another option is HTML Purifier, a re-implementation of HTML Tidy in PHP.

The current plugin user base is primarily DiSo developers, and he has not gotten any feedback from non-DiSo developers. He noted a goofy WordPress quirk: the permissions model is based on the contacts list, but WordPress only supports one blogroll list, so everyone on that list gets the same permissions. This is not a problem for most blogs, but it could be a problem for multi-author blogs. There is no affiliation with WordPress other than it being a WordPress plugin.

In the wild snapshot#2: floe.tv

An excellent in the wild post written by Josh Patterson about his floe.tv project with some feedback from me, so the credit really goes to him. For those of you unfamiliar with this series of posts, the idea is to create blog-length interviews with various in the wild apps describing their processes and the technologies that they use with regards to data portability. The goal is to profile real use cases, solutions, and lessons learned when it comes to the current state of affairs for data portability technology. Note that these posts aren’t meant to recommend or not recommend certain technology, I leave that up to the developers/architects to decide based on their needs. If you have such an app and are interested in being interviewed, please leave me a comment on one of my posts and I will get in touch with you.

Application Overview
Basically, with floe.tv we started out just wanting to try some video ideas we had. We began with a simple playlist of videos on the internet linked together at one site, which evolved into a full-blown video editor. After a number of months of development, we hit a point where the team sat down with some users and did a testing session, asking questions and gauging responses to see how well we were hitting our marks. We came to the subject of data storage, local hard drives, and getting media online, and just as a thought exercise I asked, “Well, what if floe.tv just knew about all your online media by your login name, and referenced it automatically in your libraries the first time you logged in, just as if it were an app installed locally on your hard drive?” Immediately both of them became excited, and one asked, “Can I do that right now? When can I use that?” I knew from experience that the market was speaking loudly and clearly in my direction, and that I had better listen very closely.

The very next meeting I posed this question to our team:

What if our app was “inherently installed” in the internet? What if someone logged in, and the app just acted like a desktop app that “knew” about your Flickr images and your YouTube videos, knew about your MySpace friends and Facebook friends, and automatically treated them as one logical database, one logical social graph? And what if someone could start right into an app tutorial right off the bat, with their contacts, files, and assets already referenced (but with privacy, control, etc., fully respected)?

So the next question naturally becomes “that all sounds really great, but … how do we get there?”

From there we really began to push the “what if this/that” scenarios, drew up our ideas into a document entitled WRFS, and from that we began to re-engineer the floe.tv app towards a truly linked data experience.

Technology
We are currently using FLEX/AS3 for the editor and player, with ASP.NET for the server technology. Discovery is a big deal in how I view next-gen web apps: dynamically finding data at runtime without having to go through “Big Data's” walls to get to it. Here is how I see this happening (a rough code sketch follows the list):

  1. The user logs in with an identity URL, be it an OpenID or an XRDS file location. floe.tv is an OpenID-only application, but users can map multiple OpenIDs to one account.
  2. The app authenticates the user, then uses multiple fallback methods to find the XRDS-S file if it is not specified.
  3. The XRDS file is parsed, and the location URIs for the relevant data types (here, image and video) are pulled out.
  4. Each URI points to a data container, which is then queried via its API for a list of resources the user has stored there.
  5. These results are aggregated back into a single “recordset” and returned to the floe.tv application layer.
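Here is the rough code sketch promised above: a hypothetical Javascript rendering of the flow (floe.tv's actual client is FLEX/AS3 against an ASP.NET backend). The authenticate, parseXrds, and discoverXrds helpers are assumed names, and the 'image'/'video' type tags are placeholders for whatever type URIs discovery ultimately settles on:

// Hypothetical sketch of the five-step discovery-and-aggregation flow above.
async function discoverUserMedia(identityUrl) {
  // Steps 1-2: authenticate via OpenID, then locate the XRDS-S file,
  // falling back to discovery (see the fallback sketch further down).
  const user = await authenticate(identityUrl);                 // assumed helper
  const xrdsUrl = user.xrdsUrl || await discoverXrds(identityUrl);
  const xrds = parseXrds(await (await fetch(xrdsUrl)).text());  // assumed parser

  // Step 3: pull out the service endpoints for the data types we care about.
  const endpoints = xrds.services.filter(s => s.type === 'image' || s.type === 'video');

  // Steps 4-5: query each data container's API for the user's resources,
  // then merge everything into one logical "recordset".
  const lists = await Promise.all(endpoints.map(s =>
    fetch(s.uri + '?user=' + encodeURIComponent(user.id)).then(r => r.json())));
  return [].concat(...lists);
}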

The fallback methods for XRDS discovery (done in the FLEX client) are ordered as:

  1. First, do basic yadis discovery to see whether the identity URL is the OpenID provider itself or a blog with some sort of head-link delegation setup. Either the OpenID provider or the XRDS location (or both) may be found here. In some cases, such as the DiSo XRDS plugin, the XRDS file referenced in the head link tag will carry the OpenID provider location.
  2. The secondary method we have kicked around is to query the OpenID provider for an Attribute Exchange key that points to the XRDS file. This is not well defined yet but has been discussed amongst various groups.
  3. Lastly, we fall back to having our FLEX app prompt the user for an XRDS URL so that we can “enhance their user experience”.

So although I think our secondary option with the AX key is a little shaky right now, overall we degrade gracefully.
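The discoverXrds helper used in the earlier sketch would encode exactly this chain. Again a hypothetical Javascript sketch; yadisDiscover, queryAxForXrds, and promptUserForXrdsUrl are assumed names, and as noted, the AX step is not yet well defined:

// Hypothetical sketch of the three XRDS discovery fallbacks, in order.
async function discoverXrds(identityUrl) {
  // 1. Basic yadis discovery: an X-XRDS-Location header or a <head> link element.
  const fromYadis = await yadisDiscover(identityUrl);
  if (fromYadis) return fromYadis;

  // 2. Ask the OpenID provider for an Attribute Exchange key that points
  //    at the XRDS file (not standardized yet, so this may simply fail).
  const fromAx = await queryAxForXrds(identityUrl);
  if (fromAx) return fromAx;

  // 3. Last resort: prompt the user for their XRDS URL.
  return promptUserForXrdsUrl();
}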

To show how some of the data query and aggregation mechanics might work, I've built a short demo illustrating the step-by-step mechanics.

Lessons learned
We aren't done with this application, obviously, and a lot of work remains to be done; I should note that I am currently at Step 2 of 5 as stated above. However, the application is evolving into the embryo of what I think a linked data application can and will be. What I can share are the places where we are actively looking for simple, decentralized solutions to these issues.

One thing that is slowly changing is the perception of cross-domain scripting on the internet. As services get more complex and require more aggregation of data from multiple sources, we are going to push more data-handling duties to the client; otherwise scalability will suffer. For Flash cross-domain scripting, the crossdomain.xml file must be present on the server of the API we would like to call. This is a trivial thing to set up, as it consists of a simple XML file located at the directory level you wish to grant access to.
Examples:
http://api.flickr.com/crossdomain.xml
http://api.floe.tv/crossdomain.xml
Once this file is exposed, the Flash runtime will allow calls from the client to those servers.
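For reference, a minimal and maximally permissive policy file looks something like this (a real deployment would likely scope the domain attribute more tightly than *):

<?xml version="1.0"?>
<cross-domain-policy>
  <!-- Allow Flash clients served from any domain to call this API. -->
  <allow-access-from domain="*" />
</cross-domain-policy>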

For cross-domain client-side Javascript, things get a little trickier. A lot of cross-domain tricks, such as widget embedding, are done via iframe embeds. This type of embedding significantly restricts access to the rest of the page, so the widget is effectively isolated from the rest of the page DOM. Firefox 3 will allow cross-domain client-side scripting when certain HTTP headers are present on the server response from the remote cross-domain endpoint. I'm not sure how future versions of Internet Explorer will address this issue, but I think evolutionary pressure from both Firefox and Flash will have some effect there. A new development that is supposed to make Javascript more secure is Google Caja. I've begun to follow Caja but have yet to jump deeply into that project.
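For the curious, the header-based approach works by having the remote endpoint opt in on its responses. A hedged sketch with a hypothetical endpoint (this mechanism is what later standardized as CORS, keyed off the Access-Control-Allow-Origin header):

// The server at api.example.com (hypothetical) opts in with a response header:
//   Access-Control-Allow-Origin: *
// With that header present, the browser permits the cross-domain read:
var xhr = new XMLHttpRequest();
xhr.open('GET', 'http://api.example.com/resources.json', true);
xhr.onload = function () {
  var data = JSON.parse(xhr.responseText); // remote data, handled client-side
};
xhr.send();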

We've been waiting to see how the discovery wars pan out; as it stands now, XRDS-S is looking like the service index of choice amongst the big players (presently Yahoo is endorsing it, having hired Eran Hammer Lahav as their Open Standards Evangelist). How the XRDS resource is discovered automatically, without a tremendous amount of user interaction, is something we are taking many approaches towards, as discussed above. For now we're going to focus on finding XRDS files as our catalog of service endpoints for a user. The DiSo project is going to be publishing its users' service endpoints in the XRDS format and already has a plugin for it, so I think in the short term we'll focus on consuming that data in an early conceptual demo of runtime linked data.

Once we find the XRDS file, we aren't out of the woods. How do I set my XRDS file up so that I can tell the floe.tv application that I have “images in Flickr”? This is a fundamental question being worked out at http://www.xrdstype.net/ by many different groups and people, and it has yet to be fully fleshed out.
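One plausible shape for an answer, sketched as an XRDS fragment: the Type URI below is made up (pinning down such URIs is exactly what xrdstype.net is for), and the Flickr URI is just a stand-in for wherever that container's API lives:

<!-- Hypothetical XRDS fragment saying "I have images in Flickr". -->
<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns="xri://$xrd*($v*2.0)">
  <XRD>
    <Service>
      <Type>http://xrdstype.net/media/images</Type> <!-- illustrative type URI -->
      <URI>http://api.flickr.com/services/rest/</URI>
    </Service>
  </XRD>
</xrds:XRDS>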

There's also the issue of, once we find a data endpoint, how do we talk to it? Atom? Some sort of standard data interop API that spits out RDF? In a perfect world, I'd love to see a self-organizing web: a linked data web that can find a data endpoint at runtime, find its semantic schema, and wire itself up so that it can talk to that API without ever needing user intervention; it simply understands how to talk to it.

Another key development will be the permission system, possibly using OAuth+Discovery to automate the updating of someone's XRDS-S file when they add data to a service. I need to learn a lot more about OAuth and the direction it's heading. I'd prefer a world where the user doesn't need to manually go to a site to allow their resources to be used by a third party, but for now, this is how we have to operate.

So really, I think I have more “lessons underway” than “lessons learned”, but sharing this information is key, since it sparks interest in other like-minded developers who may know a lot more about some of these areas than I do. There are going to be some places where we punt and just hard-code some scaffolding to get going, but over time I'd like to evolve towards a linked data web that auto-discovers new connections at runtime and self-organizes to give a smarter and far more intuitive user experience than we've seen so far.

If you are interested in linking data or trying some data interop experiments, please feel free to email me (jpatterson at floe.tv), check out the WRFS workgroup, or visit my blog at http://jpatterson.floe.tv.

Feedback and suggestions are welcomed.

In the wild snapshot#1: Lessons from my XFN coding experiment

In an offline conversation with Chris Messina, we discussed the idea of creating blog-length interviews with various in the wild apps describing their processes and the technologies that they use with regards to data portability. The goal is to profile real use cases, solutions, and lessons learned when it comes to the current state of affairs for data portability technology. I am using the term “data portability technology” loosely here; it is in no way affiliated with the goings-on of DataPortability.org.

So I am giving it a go to see what comes of it, because we both think this kind of information can be useful to others looking to understand the lay of the data portability land. As such, I will title all such future posts starting with “In the wild snapshot…” and assign them the category (WordPress terminology) of “in the wild snapshot”. If any of you are interested in doing such an interview, leave a comment here and I will get in touch with you. Note that these posts are generally meant for web developers, but everyone is welcome to read them, of course.

First up, I interviewed myself about my recent XFN coding experiment. Neat how that works.

Application Overview
Given the abundance of XFN producers available, I wanted to create an XFN-consuming application instead. If you need an introduction to rel-me and XFN, check out my earlier post here. The basic idea is to extract XFN information from a URL and present it in a human-readable form, in my case grouping rel-me entries into “My Online Identities” and rel-contact entries into “My contacts”. That's it, a pretty simple thing to do.

Technology
Technology considered: XFN, FOAF, Javascript, JSON, DOM, server side platform (like Ruby on Rails, etc), Google Social Graph API, Google Social Graph Test Parser, lab.backnetwork ufXtract microformats parser

Technology used: XFN, Javascript, JSON, DOM, CSS, lab.backnetwork ufXtract microformats parser

To begin with, I considered client-side (Javascript, JSON, DOM, CSS) vs. server-side (Ruby on Rails) platforms and went with client-side technologies, primarily because I had a good example client-side app to start with, courtesy of Kevin Marks (OpenSocial advocate and microformats founder). You will notice the very similar layout and fonts; I like to reuse code.

The next question was selecting an appropriate XFN parser. I could either find some Javascript library, write my own, or use a 3rd-party service. To make things easier, I decided to go with a 3rd-party service, and I had two to pick from: the lab.backnetwork microformats parser and the Google Social Graph API. I went with the lab.backnetwork parser primarily because it parses pages in real time, whereas the Google Social Graph API only parses pages already crawled by Googlebot, which can result in stale data. With the lab.backnetwork parser, I used the JSON callback to process the JSON data structure passed back by the parser. Once I had the JSON data, I sliced and diced it to dynamically generate additional HTML using Javascript, DOM, and CSS.

If you want more details on how to use Javascript to call the lab.backnetwork parser, check out this excellent post, Javascript badges powered by JSONP and microformats. Extracted from that post, here's the script-tag code calling the lab.backnetwork parser:

var script = document.createElement('script');  // build a script tag for the JSONP call
script.type = "text/javascript";
Badge.obj = badge;  // stash the badge object where the Badge.build callback can find it
script.src = "http://lab.backnetwork.com/ufXtract/?url=" + escape(link.href) + "&format=xfn&output=json&callback=Badge.build";
head[0].appendChild(script);  // appending the tag fires the cross-domain request
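The Badge.build callback named in that URL is what receives the parser's output. As a minimal sketch of the slicing-and-dicing step, assuming (hypothetically) that ufXtract hands back entries carrying url and rel values:

// Hypothetical callback sketch: group XFN entries into identities vs. contacts.
Badge.build = function (data) {
  var identities = [], contacts = [];
  (data.xfn || []).forEach(function (entry) {     // response shape is assumed
    var rels = entry.rel || [];
    if (rels.indexOf('me') !== -1) identities.push(entry.url);
    if (rels.indexOf('contact') !== -1) contacts.push(entry.url);
  });
  renderList('My Online Identities', identities); // assumed DOM helper
  renderList('My contacts', contacts);
};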

Lessons learned
As a newcomer to XFN, this was a good way, at least for me, to learn about XFN. The lab.backnetwork parser works pretty well for extracting XFN information, especially since it provides real-time parsing. However, unlike the Google Social Graph API, it doesn't currently parse FOAF. FOAF is a competing standard to XFN, but the two can be used in conjunction. Here's a post about XFN and FOAF. From the few profile pages I have seen, it is possible for people to use both XFN and FOAF: for example, on one such profile page, XFN is used to mark up the multiple rel-me identities while FOAF (in a separate file) is used to list all of the owner's friends. On other profile pages, FOAF is skipped altogether. It doesn't appear that there is a published best practice on how to mix and match the various technologies.
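For readers who haven't seen FOAF next to XFN: where XFN encodes relationships as rel attributes on ordinary HTML links, FOAF expresses the same graph in a separate RDF file, along these lines (names and URLs made up):

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://example.com/#me">
    <foaf:name>Jane Blogger</foaf:name>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Joe Blow</foaf:name>
        <foaf:homepage rdf:resource="http://joeblowblog.com/" />
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>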

Another issue I ran into is parsing and displaying human-readable names for XFN URLs. As it stands, XFN lets one define relationships between oneself and one's friends, all centered around URLs. However, URLs are not designed for optimal human readability: some URLs are long and unruly, and others employ proprietary internal naming schemes.

The reason I think it is important to couple human-readable names with URLs is that a consuming app usually wants to do something meaningful with the XFN information, and URLs alone do not provide complete information, leaving the end user to fill in the human-readable details after the initial extraction.

In my discussion with Kevin Marks, he indicated that hCard can and should be used along with XFN to provide complete information. For example, it is possible to have the following XFN and hCard markup:

<li class="vcard"><a class="fn url" href="http://joeblowblog.com" rel="met colleague friend">Joe Blow</a></li>
<li class="vcard"><a class="fn url" href="http://janedoeblog.com" rel="met colleague friend">Jane Doe</a></li>

I think this is a best practice that is not obvious. Developers are generally familiar with each individual microformats standard, but I haven't seen much documentation on how to mix and match the various standards for optimal use. Each standard tends to be described in a silo, without consideration for the others, so hopefully revelations like this can help developers better understand how to use the standards together.

Even though the XFN/hCard combination is more complete than XFN alone, I still see some issues with it. For example, a parser has to understand the implied relationship between the hCard information and the XFN information and return them as one related entity; that is, the hCard provides the human-readable name for the XFN URL. This relationship is currently not part of the hCard or XFN spec, so it has to be inferred by the developer. I would also like this type of cross-standards best practice to extend to XFN/FOAF, etc. Note that at this time, the Google Social Graph API does not parse hCard information, so even if someone puts that information on their profile page, it won't be useful if the consuming app relies on the Google Social Graph API. Kevin indicated that he might rectify this in the future and extend the API to also parse hCard.
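To make that inference concrete, here is a rough Javascript sketch of what a consuming parser has to do. The pairing rule (the hCard fn supplies the display name for the XFN URL on the same anchor) is the inferred convention described above, not anything in either spec:

// Sketch: pair each XFN link with the hCard "fn" on the same anchor,
// yielding { name, url, rels } records instead of bare URLs.
function extractNamedContacts(doc) {
  var anchors = doc.querySelectorAll('li.vcard a.fn.url[rel]');
  return Array.prototype.map.call(anchors, function (a) {
    return {
      name: a.textContent,                      // human-readable name from hCard
      url: a.href,                              // identity URL shared by both formats
      rels: a.getAttribute('rel').split(/\s+/)  // XFN relationship terms
    };
  });
}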

One last thought: even though I started my application using Javascript, if I wanted to do more useful stuff, I would switch over to server-side code. In particular, if I need to store persistent user information, I need a database, and that is best facilitated by a server-side platform.

Feedback and suggestions are welcomed.

Update
Chris pointed me to a blog post he did on XFN, Portable contact lists and the case against XFN; it's worth a read, IMO.