Difference between revisions of "BioMart Tutorial"

{{UnderConstruction}}
+
[[File:Biomart250.png|center|link=http://www.biomart.org/|BioMart]]
  
{| class="tutorialheader"
+
Please visit [http://biomart.org the BioMart website] for all information on BioMart.
| align="right" | {{#icon: Biomart250.png|BioMart|200|BioMart}}<br /><br />{{#icon: GMOD2009Europe170.png|2009 GMOD Summer School - Europe||2009 GMOD Summer School - Europe}}
+
| {{TutorialTitleLine|[[BioMart]]}}<br />
+
2009 [[GMOD Summer School]] - [[2009 GMOD Summer School - Europe|Europe]] & [[2009 GMOD Summer School - Americas|Americas]]<br />
+
July & August 2009<br />
+
[[User:Junjun|Junjun Zhang]]
+
|}
+
__NOTITLE__
+
  
 
+
[[Category:BioMart]]
This [[:Category:Tutorials|tutorial]] walks you through how to install and configure a local installation of [[BioMart]].  This tutorial was originally taught by [[User:Junjun|Junjun Zhang]] at the 2009 [[GMOD Summer School]] - [[2009 GMOD Summer School - Europe|Europe]] & [[2009 GMOD Summer School - Americas|Americas]].  The notes and VMware image used on this page are from the Europe course.
+
[[Category:Tutorials]]
 
+
 
+
__TOC__
+
 
+
 
+
== VMware ==
+
{|
+
| valign="top" |This tutorial was taught using a [[VMware]] system image as a starting point.  If you want to start with that same system, download and install the ''Starting'' image.
+
 
+
'''''See [[VMware]] for what software you need to use a VMware system image, and for directions on how to get the image set up and running on your machine.'''''
+
|
+
{| style="margin-left: 1em; margin-top: 0; " class="wikitable"
+
! Download
+
|-
+
| align="center" | [ftp://ftp.gmod.org/pub/gmod/Courses/2009/SummerSchoolEurope/GmodSumSch2009EU3.tar.gz Starting&nbsp;Image]<br>
+
[ftp://ftp.gmod.org/pub/gmod/Courses/2009/SummerSchoolEurope/GmodSumSch2009EU3-5.tar.gz Ending Image]<br />
+
----
+
Username:&nbsp;gmod<br />Password: gmod
+
|}
+
|}
+
 
+
== Caveats ==
+
{{TutorialCaveats}}
+
__TOC__
+
 
+
 
+
==Introduction==
+
 
+
[[BioMart]] is a query-oriented data management and integration system. The system uses a generic data model for data integration and storage; it can be used for any type of data and is particularly suited to complex descriptive biological data. BioMart provides several interfaces for building and executing complex queries: a human-friendly web-based GUI, and program-friendly APIs and web services.
+
 
+
===Explore over 20 public databases through BioMart Central Portal===
+
 
+
BioMart Central Portal (http://www.biomart.org) provides a unified interface for querying over 20 public databases with a large variety of contents.
+
 
+
[[Image:PoweredByBioMart.png | border ]]
+
 
+
 
+
This section gives you a basic idea of how [[BioMart]] helps biologists search for the data they are interested in through BioMart's intuitive web-based GUI, '''[http://www.biomart.org/biomart/martview MartView]'''.
+
 
+
[[Image:MartViewGUI.png | border ]]
+
 
+
 
+
'''Sample queries (from http://www.biomart.org/biomart/martview):'''
+
<div class="emphasisbox">
+
# Retrieve Ensembl Gene ID, Chromosome Name, Gene Start (bp), Gene End (bp) of all human genes from ensembl mart ([http://www.biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_gene_ensembl.default.feature_page.ensembl_gene_id|hsapiens_gene_ensembl.default.feature_page.chromosome_name|hsapiens_gene_ensembl.default.feature_page.start_position|hsapiens_gene_ensembl.default.feature_page.end_position&FILTERS=&VISIBLEPANEL=resultspanel bookmark])
+
# Restrict the results of the previous query to region of chromosome:1, Gene Start (bp):1 and Gene End (bp):100000
+
# Retrieve 300bp upstream flanking sequence for Ensembl Gene: ENSG00000000419, ENSG00000000457
+
# How do I convert IDs? I have the following Ensembl Gene IDs from human dataset: ENSG00000000419, ENSG00000000457 and I would like HGNC symbols and RefSeq DNA IDs along with matching Affymetrix platform HG U133-PLUS-2 probes
+
# (Two datasets query) How do I retrieve all mouse homologues for human genes?
+
# (Two datasets query) Restrict the results of the previous query to human genes on chromosome 1 and mouse orthologs on chromosome 2
+
# (Two datasets query) Retrieve all human Ensembl Genes (output Gene ID and [http://www.genenames.org/ HGNC] symbol) that are involved in a pathway with a [http://reactome.org Reactome] pathway stable ID: REACT_1698 (output pathway stable ID and pathway name) ([http://www.biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_gene_ensembl.default.feature_page.ensembl_gene_id|pathway.default.feature_page.stableidentifier_identifier|pathway.default.feature_page._displayname&FILTERS=pathway.default.filters.pathway_id_list.%22REACT_1698%22&VISIBLEPANEL=resultspanel bookmark])
+
</div>
+
 
+
==System overview and installation==
+
 
+
===What tools are included in BioMart?===
+
 
+
*Building Mart: '''MartBuilder''' and '''MartRunner'''
+
*Configuring Mart: '''MartEditor'''
+
*Querying Mart: '''Perl API''', '''Java API''', '''MartView''' (web GUI, based on Perl API), '''MartService''' (web service interface, based on Perl API), '''MartExplorer''' (based on Java API), '''MartShell''' (based on Java API)
+
 
+
[[Image:WhatInBioMart.png | border ]]
+
 
+
===System installation===
+
 
+
====Installing biomart-perl====
+
<div class="emphasisbox">
+
The current release (0.7) of the biomart-perl source code is available from [[Glossary#CVS|CVS]] (password: CVSUSER):
+
cvs -d :pserver:cvsuser@cvs.sanger.ac.uk:/cvsroot/biomart login
+
cvs -d :pserver:cvsuser@cvs.sanger.ac.uk:/cvsroot/biomart co -r release-0_7 biomart-perl
+
 
+
For this tutorial, we will use the biomart-perl source code from [[Glossary#SVN|SVN]] main trunk (below).
+
</div>
+
 
+
Biomart-perl source code is available from SVN:
+
svn co <nowiki>https://code.oicr.on.ca/svn/biomart/biomart-perl/trunk</nowiki> biomart-perl
+
 
+
The svn checkout above has already been done in the [[#VMware|VMware image]] at <tt>/home/gmod/software/biomart/biomart-perl</tt>.
+
 
+
Update your local copy of the source code:
+
cd /home/gmod/software/biomart/biomart-perl
+
svn update
+
 
+
Prerequisites for biomart-perl
+
*You need to have perl version 5.6.0 or later installed first.
+
*biomart-perl depends on a number of Perl modules; the configure script lists all of the dependencies when you run it.
+
*You need to have apache web server and mod_perl installed.
+
*You will also need one database server installed. BioMart currently supports three [[Glossary#RDBMS|RDBMSs]]: [[MySQL]], [[PostgreSQL]] and Oracle.
+
 
+
Intentionally, we have left the following Perl modules for you to install:
+
Number::Format
+
OLE::Storage_Lite
+
Test::Exception
+
Template::Plugin::Number::Format
+
 
+
Using apt-get:
+
sudo apt-get update
+
sudo apt-get install libnumber-format-perl
+
sudo apt-get install libole-storage-lite-perl
+
sudo apt-get install libtest-exception-perl
+
 
+
Using CPAN:
+
sudo cpan Template::Plugin::Number::Format
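
Before and after installing, it can help to see which of these modules Perl can actually load. The following is a minimal sketch, not part of biomart-perl: the function name <tt>missing_perl_modules</tt> is made up for this example, and it relies only on the fact that <tt>perl -MSome::Module -e 1</tt> exits with an error when the module cannot be loaded.

```shell
# Print the name of every Perl module in the argument list that perl cannot load.
# (missing_perl_modules is a hypothetical helper, not a BioMart command.)
missing_perl_modules() {
    for module in "$@"; do
        # "perl -MName -e 1" exits non-zero when the module cannot be loaded
        if ! perl -M"$module" -e 1 >/dev/null 2>&1; then
            echo "$module"
        fi
    done
}

# The four modules this tutorial asks you to install:
missing_perl_modules Number::Format OLE::Storage_Lite Test::Exception Template::Plugin::Number::Format
```

If nothing is printed, all four modules are already available.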
+
 
+
====Installing martj====
+
 
+
<div class="emphasisbox">The wget has already been done on the [[#VMware|VMware image]].</div>
+
The martj binary can be obtained as follows:
+
cd /home/gmod/software/biomart/
+
wget <nowiki>ftp://anonymous@ftp.ebi.ac.uk/pub/software/biomart/martj_current/martj-bin.tgz</nowiki>
+
tar -zxf martj-bin.tgz
+
 
+
After this a folder named <tt>martj-0.7</tt> will be created under <tt>/home/gmod/software/biomart/</tt>
+
 
+
Prerequisites for martj
+
*Java 1.5 or later.
+
 
+
Java-based tools are launched with the corresponding scripts under the <tt>bin</tt> directory: use <tt>*.bat</tt> on Windows and <tt>*.sh</tt> on Mac and Linux. For example, in the VMware image we can launch MartEditor with:
+
cd /home/gmod/software/biomart/martj-0.7
+
./bin/marteditor.sh
+
 
+
==Build your first Mart, configure and deploy BioMart Server==
+
 
+
The process of deploying a [[BioMart]] Server can be logically divided into two steps: transformation and configuration. Transforming an existing data source into a mart database can be carried out using MartBuilder or a user-written data converter. Configuration, that is, defining one or more views ('''Attributes''' and '''Filters''') on your data, is done with '''MartEditor''', followed by running the Perl <tt>configure.pl</tt> script.
+
 
+
Workflow of creating, configuring and deploying a BioMart Server:
+
 
+
[[Image:CreateConfigMart.png | border]]
+
 
+
===What is a Data Mart?===
+
 
+
A '''mart''' is a collection of datasets. It is nearly always synonymous with a database in [[MySQL]], or a [[Glossary#Schema|schema]] in Oracle and [[Postgres]].
+
 
+
A '''dataset''' is a collection of tables that follow a given naming convention. The table naming convention is '''dataset__content__type''', where '''dataset''' is the name of the dataset, '''content''' is a free-text summary of the contents of the table, and '''type''' is either '''main''' (for main tables) or '''dm''' (for dimension tables).
+
 
+
Each dataset must have at least one central table called the main table, with a type of '''main'''. This main table is involved in all queries, and will normally contain the information most frequently requested. It must have one column ending in the suffix '''_key''' which contains a unique identifier for each row, similar in function to a primary key.
+
 
+
A dataset may optionally have a number of dimension tables containing satellite information related to the main table. These dimension tables are recognized by having a type of '''dm'''. Each dimension table must have a column that contains values from the '''_key''' column of the main table to which the data in the dimension table is related, similar in function to a foreign key.
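
The naming convention can be checked mechanically. The sketch below splits a table name into its three parts and rejects names whose type is neither '''main''' nor '''dm'''. The function name <tt>parse_mart_table</tt> and the example table names are hypothetical and are not taken from the demo data.

```shell
# Split a mart table name of the form dataset__content__type into its parts.
# Prints "dataset content type", or returns 1 if the name does not fit.
# (parse_mart_table is a hypothetical helper written for this tutorial.)
parse_mart_table() {
    name="$1"
    type="${name##*__}"          # text after the last "__"
    rest="${name%__*}"           # everything before the last "__"
    dataset="${rest%%__*}"       # text before the first "__"
    content="${rest#*__}"        # text between the first and last "__"
    case "$type" in
        main|dm) ;;              # the only two table types in a mart
        *) return 1 ;;
    esac
    # with fewer than two "__" separators, rest has no "__" left to split on
    [ "$rest" != "$dataset" ] || return 1
    echo "$dataset $content $type"
}

parse_mart_table mydemo__gene__main
```

The last line prints <tt>mydemo gene main</tt>. A name such as <tt>mydemo__gene__other</tt> makes the function return a non-zero status instead.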
+
 
+
A dataset with a single main table and a number of dimensions looks something like this:
+
 
+
[[Image:MartModel.png | border ]]
+
 
+
In the example above, the dataset name is '''mydemo'''; it contains one main table and four dimension tables.
+
 
+
The set of all columns from all tables in a dataset is equivalent to the set of '''Attributes''' available on that dataset. Every '''Filter''' in a dataset is created by restricting an attribute to a particular value or range of values. Therefore filters are like the where-clause in [[Glossary#SQL|SQL]] statements and attributes are like the columns listed in the select portion of a SQL statement.
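
To make the analogy concrete, the sketch below composes the SQL statement that a query on a single table would correspond to: the attributes become the select list and the filter becomes the where-clause. It is an illustration only. The function name, the table name and the column names are assumptions, no quoting or escaping of values is attempted, and this is not how BioMart itself generates SQL.

```shell
# Compose the SQL that a single-table mart query corresponds to.
# usage: mart_query_sql TABLE FILTER_COLUMN FILTER_VALUE ATTRIBUTE...
# (hypothetical helper; table and column names below are made up)
mart_query_sql() {
    table="$1"; filter_col="$2"; filter_val="$3"
    shift 3
    columns=""
    for attr in "$@"; do
        # join the attribute names with ", "
        columns="${columns:+$columns, }$attr"
    done
    echo "SELECT $columns FROM $table WHERE $filter_col = '$filter_val'"
}

mart_query_sql mydemo__gene__main chromosome_name 1 stable_id gene_symbol
```

This prints <tt>SELECT stable_id, gene_symbol FROM mydemo__gene__main WHERE chromosome_name = '1'</tt>.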
+
 
+
One key feature of this model is its simplicity. With far fewer tables to join, queries run fast. The design originates from the '''star schema''' used in industry data warehouses. The difference is that the relation between main and dm tables is 1:n in the BioMart model, while it is n:1 in a star schema. For that reason, the BioMart model is often referred to as a '''reversed star'''. What the two have in common is that dimension tables (and, in the BioMart model, the main table as well) are highly '''denormalized''', ''i.e.'', related tables are merged into one table when certain rules are met. In the resulting table, values in many columns can be highly redundant. A denormalized table is comparable to a '''materialized view''', where the join of all the tables has already been done and the result is stored physically on disk. In short, the whole design is a '''space-time''' trade-off: extra storage is spent to save query time.
+
 
+
===Creating your own Mart: create/load sample mart===
+
 
+
Download demo data:
+
cd /home/gmod/software/biomart
+
rm my_mart.tar.gz
+
wget <nowiki>http://www.biomart.org/mart_demo.tar.gz</nowiki>
+
tar -zxf mart_demo.tar.gz
+
 
+
Load data into mart:
+
cd data
+
mysql -uroot -e 'grant all on *.* to gmod@localhost identified by "gmod"'
+
mysql -ugmod -pgmod -e 'create database my_mart'
+
mysql -ugmod -pgmod my_mart < my_mart.sql
+
 
+
===Configuring your Mart===
+
Start MartEditor by issuing the following command under <tt>martj-0.7</tt> folder:
+
 
+
cd /home/gmod/software/biomart/martj-0.7
+
./bin/marteditor.sh
+
 
+
If you get a JDBC driver warning message, please ignore it.
+
 
+
The main menu for MartEditor is shown below:
+
 
+
[[Image:MartEditorMenu.png | border]]
+
 
+
Now connect to the mart we just created, File &rarr; Database Connection, and input connection parameters as shown below:
+
 
+
[[Image:ConnectDB.png | border]]
+
 
+
Password is ''gmod''.
+
 
+
File &rarr; Naïve, then choose dataset: mydemo
+
 
+
This will create a naïve configuration of the newly created dataset. For now we will just use this configuration to continue setting up the BioMart web server. Later, we will go back to MartEditor to adjust it and add filter collections, a drop-down list and a link to another dataset.
+
 
+
[[Image:MartEditorPanal.png | border]]
+
 
+
Finally, File &rarr; Export, which will save the configuration back to the meta tables in the mart we created: my_mart.
+
 
+
===Setting the Registry===
+
 
+
The registry file holds the connection parameters for the data sources (''i.e.'', marts) you would like to include. A data source can be your own database (mart) or a publicly available mart. Several example registry files (<tt>*.xml</tt>) are available under the directory:
+
/home/gmod/software/biomart/biomart-perl/conf/
+
 
+
Here is the registry for the mart we just created:
+
<xml><?xml version="1.0" encoding="UTF-8"?>
+
<!DOCTYPE MartRegistry>
+
<MartRegistry>
+
  <MartDBLocation
+
          name        = "my_mart"
+
          displayName  = "My BioMart Database"
+
          databaseType = "mysql"
+
          host        = "localhost"
+
          port        = "3306"
+
          database    = "my_mart"
+
          schema      = "my_mart"
+
          user        = "gmod"
+
          password    = "gmod"
+
          visible      = "1"
+
          default      = ""
+
          includeDatasets = ""
+
          martUser    = ""
+
  />
+
</MartRegistry></xml>
+
 
+
<div class="emphasisbox">Be careful when copying and pasting XML: make sure it is well-formed. In particular, don't leave any white space or empty lines before <tt><?xml version="1.0" encoding="UTF-8"?></tt></div>
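
A quick way to catch stray white space at the top of a registry file is to look at its first five bytes. The following is a minimal sketch; <tt>starts_with_xml_decl</tt> is a made-up name, and the check covers only the leading white space problem, not well-formedness in general.

```shell
# Succeed only if the file begins with "<?xml" at byte 0,
# i.e. no spaces or blank lines precede the XML declaration.
# (starts_with_xml_decl is a hypothetical helper, not part of biomart-perl.)
starts_with_xml_decl() {
    [ "$(head -c 5 "$1")" = "<?xml" ]
}
```

For example, <tt>starts_with_xml_decl conf/my_mart.xml || echo "white space before the XML declaration"</tt>. If <tt>xmllint</tt> from libxml2 happens to be installed, <tt>xmllint --noout conf/my_mart.xml</tt> gives a more thorough well-formedness check.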
+
 
+
We save it as <tt>my_mart.xml</tt> in the <tt>biomart-perl/conf</tt> folder:
+
cd /home/gmod/software/biomart/biomart-perl/conf
+
xedit my_mart.xml
+
{{TextEditorLink|xedit}}
+
 
+
===Setting Web Server Configuration===
+
 
+
biomart-perl creates a custom Apache web server configuration file (<tt>httpd.conf</tt>) under <tt>biomart-perl/conf</tt>, which is later used to start the Apache web server. The contents of this file are generated automatically. However, deployers are expected to set the path to the Apache binary, the host name, the port and the path to apxs in <tt>biomart-perl/conf/settings.conf</tt>. These settings are used by the configure step explained in the next section.
+
 
+
Open <tt>conf/settings.conf</tt> with [[Linux Text Editors|xedit]] and set the following:
+
apacheBinary=/usr/sbin/apache2
+
serverHost=localhost
+
port=9002
+
apxs=/usr/bin/apxs2
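
A wrong path in these settings tends to show up only when the server fails to start, so it can be worth checking them first. The sketch below assumes the simple <tt>key=value</tt> lines shown above (optionally with spaces around the equals sign); both function names are made up for this example, and any other structure in the real <tt>settings.conf</tt> is ignored.

```shell
# Print the value of KEY from a key=value settings file (first match only).
# usage: get_setting FILE KEY
# (hypothetical helper; assumes plain "key=value" lines)
get_setting() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# Report the two path settings if they do not point to an executable file.
# usage: check_settings_paths FILE
check_settings_paths() {
    for key in apacheBinary apxs; do
        path="$(get_setting "$1" "$key")"
        [ -x "$path" ] || echo "$key: '$path' is not an executable file"
    done
}
```

Running <tt>check_settings_paths conf/settings.conf</tt> from the <tt>biomart-perl</tt> directory prints nothing when both paths are usable.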
+
 
+
===Run Configure Script===
+
 
+
From the <tt>biomart-perl</tt> directory, type:
+
cd ~/software/biomart/biomart-perl
+
perl bin/configure.pl -r conf/my_mart.xml --clean
+
 
+
It will ask:
+
Do you want to install in API only mode [y/n] [n]:
+
Type '''n''' and hit '''Enter'''.
+
 
+
===Starting and stopping Web Server===
+
 
+
From the biomart-perl directory,
+
to start the apache server, type:
+
/usr/sbin/apache2 -d $PWD -f $PWD/conf/httpd.conf
+
 
+
to stop the apache server, type:
+
kill `cat logs/httpd.pid`
+
<div style="font-size: 80%">Note: Those are backquotes, ''not'' standard single quotes, around the <tt>cat</tt> command.  This detail matters.  Backquotes invoke [http://freeengineer.org/learnUNIXin10minutes.html#CommandSubst Unix command substitution].</div>
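
The same stop command can be written with the <tt>$(...)</tt> form of command substitution, which is equivalent to backquotes and harder to mistake for single quotes. The sketch below also handles the case where no pid file exists. The function name <tt>stop_biomart</tt> is an assumption made for this example.

```shell
# Stop the server whose process ID is stored in a pid file.
# usage: stop_biomart [PIDFILE]   (defaults to logs/httpd.pid)
# (stop_biomart is a hypothetical helper, not part of biomart-perl.)
stop_biomart() {
    pidfile="${1:-logs/httpd.pid}"
    if [ ! -f "$pidfile" ]; then
        echo "no pid file at $pidfile; is the server running?"
        return 0
    fi
    # $(cat file) is command substitution, equivalent to `cat file`
    kill "$(cat "$pidfile")"
}
```

Run <tt>stop_biomart</tt> from the biomart-perl directory, as with the <tt>kill</tt> command above.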
+
 
+
===Testing MartView===
+
 
+
Now, point your web browser to:
+
: http://localhost:9002/biomart/martview
+
and check that the installation works. Note: replace localhost with the IP address of your [[#VMware|VM]] if you run the web browser from your laptop's [[Glossary#OS|OS]].
+
 
+
===More exercises with MartEditor===
+
 
+
====Create two new FilterCollections: Chromosome and Gene Type====
+
 
+
[[Image:MartEditorContextMenu.png | border]]
+
 
+
The Context Menu can be accessed by right-clicking any node in the Tree Panel. To insert a new FilterCollection, right-click the '''FilterGroup''' you wish to add it to.
+
 
+
Do the following steps:
+
*insert a new FilterCollection and set its displayName to '''Gene Type'''
+
*cut-n-paste biotype_1020 Filter to '''Gene Type''' FilterCollection
+
*insert a new FilterCollection, change its displayName to '''Chromosome'''
+
*drag-n-drop chromosome_name_1059 Filter to '''Chromosome''' FilterCollection
+
 
+
We can also modify some default values used in the naive configuration:
+
*change displayName of attribute:stable_id_1023 to '''Ensembl Gene ID'''
+
*set '''default''' to '''true''' for attribute:stable_id_1023
+
*change displayName of attribute:gene_symbol_1074 to '''Gene symbol'''
+
*set '''default''' to '''true''' for attribute:gene_symbol_1074
+
 
+
Don't forget to '''Export''' your new configuration from MartEditor.
+
 
+
Now stop apache server, re-run configure.pl, and start apache server again. Make sure you are in  /home/gmod/software/biomart/biomart-perl, then do the following:
+
kill `cat logs/httpd.pid`
+
perl bin/configure.pl -r conf/my_mart.xml --clean
+
/usr/sbin/apache2 -d $PWD -f $PWD/conf/httpd.conf
+
 
+
We will need to do this a few times more, so it's better to put the commands in a shell script:
+
cd /home/gmod/software/biomart/biomart-perl
+
xedit restart.sh
+
 
+
Copy and paste the three commands above into the file, then save.
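
If you prefer not to use an editor, the same file can be written from the shell with a here-document. This is a sketch: the function name <tt>write_restart_script</tt> is made up, and the file it writes contains exactly the three commands listed earlier in this section plus comments.

```shell
# Write the three restart commands into a script file.
# usage: write_restart_script [FILE]   (defaults to restart.sh)
# The quoted 'EOF' keeps $PWD and the backquotes literal, so they are
# evaluated when the script runs, not when the file is written.
# (write_restart_script is a hypothetical helper.)
write_restart_script() {
    cat > "${1:-restart.sh}" <<'EOF'
#!/bin/sh
# Stop Apache, rebuild the BioMart configuration, start Apache again.
# Run from /home/gmod/software/biomart/biomart-perl.
kill `cat logs/httpd.pid`
perl bin/configure.pl -r conf/my_mart.xml --clean
/usr/sbin/apache2 -d $PWD -f $PWD/conf/httpd.conf
EOF
}
```

Calling <tt>write_restart_script</tt> from the <tt>biomart-perl</tt> directory creates <tt>restart.sh</tt> there. The configure script still asks its question about API only mode each time it runs.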
+
 
+
Make it executable by everyone:
+
chmod +x restart.sh
+
 
+
The next time we need to reconfigure the server, we do:
+
cd /home/gmod/software/biomart/biomart-perl
+
./restart.sh
+
 
+
 
+
Go to http://localhost:9002/biomart/martview to check out the new FilterCollections we just created.
+
 
+
====Make a dropdown list for Chromosome name Filter====
+
 
+
Right-click the '''Chromosome name''' Filter and choose '''make drop down''' from the Context Menu. That's all!
+
 
+
If you want to allow multiple options to be selected in this drop down list, simply set '''multipleValues''' to '''1''', export configuration, and reconfigure MartView.
+
 
+
'''Export''' the new configuration.
+
 
+
Now stop apache server, re-run configure.pl, and start apache server again.
+
 
+
cd /home/gmod/software/biomart/biomart-perl/
+
./restart.sh
+
 
+
Go to http://localhost:9002/biomart/martview to check out the change for Chromosome name filter.
+
 
+
====Configure links between datasets (ie, federation)====
+
 
+
In BioMart, a '''link''' is built through a pair of '''Exportable''' and '''Importable''', each defined in one of the two to-be-linked datasets.
+
 
+
We can think of an Exportable as an Attribute (or a list of Attributes) that one dataset exports to the other dataset to fetch related data records. Similarly, an Importable can be seen as a Filter: one dataset takes the Exportable from the other dataset and applies it to its own Filter.
+
 
+
Let's look at an example: the mydemo dataset can be linked with hsapiens_gene_ensembl in the Ensembl Gene mart through the common Ensembl Gene ID field. We can define an Exportable in hsapiens_gene_ensembl, and an Importable in mydemo.
+
 
+
hsapiens_gene_ensembl already has an Exportable defined, see below:
+
 
+
[[image:MartEditorExportable.png | border]]
+
 
+
Useful tip: you can always connect to Ensembl Mart with MartEditor to learn how Filters and Attributes are defined.
+
 
+
{| class="wikitable"
+
! colspan=2 | Ensembl Mart MySQL connection parameters
+
|-
+
! Host
+
| martdb.ensembl.org
+
|-
+
! Port
+
| 5316
+
|-
+
!  User
+
| anonymous
+
|-
+
!  Databases
+
| ensembl_mart_55
+
|}
+
 
+
Now let's create an Importable for mydemo dataset.
+
 
+
[[Image:MartEditorCreateImportable.png | border]]
+
 
+
The Importable should look like this:
+
 
+
[[Image:MartEditorImportable.png | border]]
+
 
+
<div class="emphasisbox">Important note: only Exportable and Importable with exactly matched linkName will be linked.</div>
+
 
+
Don't forget to '''Export''' your configuration to mart: File &rarr; Export
+
 
+
Now we have to add the hsapiens_gene_ensembl dataset to the registry, together with mydemo.
+
 
+
Here is what the new registry looks like:
+
<xml><?xml version="1.0" encoding="UTF-8"?>
+
<!DOCTYPE MartRegistry>
+
<MartRegistry>
+
  <MartDBLocation
+
          name        = "my_mart"
+
          displayName  = "My BioMart Database"
+
          databaseType = "mysql"
+
          host        = "localhost"
+
          port        = "3306"
+
          database    = "my_mart"
+
          schema      = "my_mart"
+
          user        = "gmod"
+
          password    = "gmod"
+
          visible      = "1"
+
          default      = ""
+
          includeDatasets = ""
+
          martUser    = ""
+
  />
+
  <MartDBLocation
+
          name        = "ensembl_gene"
+
          displayName  = "Ensembl Gene"
+
          databaseType = "mysql"
+
          host        = "martdb.ensembl.org"
+
          port        = "5316"
+
          database    = "ensembl_mart_55"
+
          schema      = "ensembl_mart_55"
+
          user        = "anonymous"
+
          password    = ""
+
          visible      = "1"
+
          default      = ""
+
          includeDatasets = "hsapiens_gene_ensembl"
+
          martUser    = ""
+
  />
+
</MartRegistry>
+
</xml>
+
Now stop apache server, re-run <tt>configure.pl</tt>, and start apache server again.
+
 
+
cd /home/gmod/software/biomart/biomart-perl/
+
./restart.sh
+
 
+
Finally, you can test queries against federated datasets at http://localhost:9002/biomart/martview.
+
 
+
[[image:MartViewJoinQuery.png | border]]
+
 
+
==Access BioMart Server via program-friendly interfaces: API and MartService==
+
 
+
===Perl API===
+
 
+
After setting up a query in MartView, click the '''Perl''' button (top right corner) to get a piece of automatically generated Perl code. With a few simple modifications, you can run the code to query the dataset through the Perl API. Here is a [http://localhost:9002/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=mydemo.default.naive_attributes.stable_id_1023|mydemo.default.naive_attributes.gene_symbol_1074|mydemo.default.naive_attributes.source_1018|mydemo.default.naive_attributes.chromosome_name_1059|mydemo.default.naive_attributes.seq_region_start_1020|mydemo.default.naive_attributes.seq_region_end_1020&FILTERS=mydemo.default.naive_filters.chromosome_name_1059.%221%22&VISIBLEPANEL=resultspanel sample query].
+
 
+
Let's copy and paste the Perl code into xedit and save it under /home/gmod/software/biomart/biomart-perl/scripts.
+
cd /home/gmod/software/biomart/biomart-perl/scripts
+
xedit myApiTest.pl
+
 
+
Add this line to include Perl libraries at the top of the code:
+
<perl>use lib '/home/gmod/software/biomart/biomart-perl/lib';</perl>
+
 
+
Modify this line to set the correct registry file:
+
<perl>my $confFile = '/home/gmod/software/biomart/biomart-perl/conf/my_mart.xml';</perl>
+
 
+
Run it as:
+
perl myApiTest.pl
+
 
+
===MartService===
+
 
+
MartService provides a program-friendly interface for end-users and third-party tools to interact with a BioMart Server. A few systems (''e.g.'', Taverna, Galaxy and the biomaRt R package) have implemented plugins based on MartService.
+
 
+
====Get Results====
+
 
+
The following request is used to retrieve data from a BioMart database. An XML-based query containing '''attributes''', '''filters''' and '''datasets''' is POSTed to a target BioMart web server, which returns either the results or the number of entries, depending on the request.
+
 
+
http://localhost:9002/biomart/martservice?query=<QUERY_XML>
+
 
+
A '''Query XML''' example:
+
<xml>
+
<?xml version="1.0" encoding="UTF-8"?>
+
<!DOCTYPE Query>
+
<Query  virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
+
  <Dataset name = "mydemo" interface = "default" >
+
  <Filter name = "chromosome_name_1059" value = "1"/>
+
  <Attribute name = "stable_id_1023" />
+
  <Attribute name = "gene_symbol_1074" />
+
  <Attribute name = "chromosome_name_1059" />
+
  <Attribute name = "seq_region_start_1020" />
+
  <Attribute name = "seq_region_end_1020" />
+
  <Attribute name = "source_1018" />
+
  </Dataset>
+
</Query>
+
</xml>
+
 
+
'''Useful tip:'''
+
*To retrieve an XML Query from any BioMart Web interface (MartView), hit the '''XML''' button after making your selection of database, datasets, attributes and filters
+
 
+
Save the above query XML in query.xml and put it under /home/gmod/software/biomart/biomart-perl/scripts.
+
 
+
cd /home/gmod/software/biomart/biomart-perl/scripts
+
xedit query.xml
+
 
+
Edit webExample.pl so that path points to your own server:
+
 
+
<nowiki>my $path="http://localhost:9002/biomart/martservice?";</nowiki>
+
 
+
Now run:
+
perl webExample.pl query.xml
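
The same request can also be sketched in plain shell. The first function below prints a Query XML with one filter and any number of attributes, following the layout of the example above. The second posts a query file to MartService with <tt>curl</tt>, whose <tt>--data-urlencode name@file</tt> option URL-encodes the file contents and sends them as the form field <tt>query</tt>. Both function names are made up for this example, values are not XML-escaped, and the default URL assumes the server set up in this tutorial.

```shell
# Print a Query XML for one dataset with a single filter.
# usage: make_query_xml DATASET FILTER_NAME FILTER_VALUE ATTRIBUTE...
# (hypothetical helper; no XML escaping of the values is attempted)
make_query_xml() {
    dataset="$1"; filter="$2"; value="$3"
    shift 3
    echo '<?xml version="1.0" encoding="UTF-8"?>'
    echo '<!DOCTYPE Query>'
    echo '<Query virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >'
    echo "  <Dataset name = \"$dataset\" interface = \"default\" >"
    echo "    <Filter name = \"$filter\" value = \"$value\"/>"
    for attr in "$@"; do
        echo "    <Attribute name = \"$attr\" />"
    done
    echo '  </Dataset>'
    echo '</Query>'
}

# POST a query file to MartService as the form field "query".
# usage: post_query FILE [MARTSERVICE_URL]
post_query() {
    curl --silent --data-urlencode "query@$1" "${2:-http://localhost:9002/biomart/martservice}"
}
```

For example, <tt>make_query_xml mydemo chromosome_name_1059 1 stable_id_1023 gene_symbol_1074 > q.xml</tt> followed by <tt>post_query q.xml</tt>, with the server running.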
+
 
+
====Get Metadata====
+
 
+
The requests described in this section are used to retrieve which '''marts''', '''datasets''', '''attributes''', '''filters''' and '''formatters''' are available on a particular BioMart web server.
+
 
+
{| class="wikitable"
+
! Get Marts
+
| http://localhost:9002/biomart/martservice?type=registry
+
|-
+
! Get Datasets
+
| http://localhost:9002/biomart/martservice?type=datasets&mart=my_mart
+
|-
+
! Get Attributes
+
| http://localhost:9002/biomart/martservice?type=attributes&dataset=mydemo
+
|-
+
!Get Filters
+
| http://localhost:9002/biomart/martservice?type=filters&dataset=mydemo
+
|}
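
The four metadata requests differ only in their <tt>type</tt> and in one extra parameter, so the URLs are easy to build in a script. The sketch below is one way to do it; the function name and the <tt>MARTSERVICE_URL</tt> environment variable are assumptions made for this example, not something MartService defines.

```shell
# Build a MartService metadata URL.
# usage: martservice_meta_url TYPE [PARAM VALUE]
# (hypothetical helper; MARTSERVICE_URL is an optional override invented here)
martservice_meta_url() {
    base="${MARTSERVICE_URL:-http://localhost:9002/biomart/martservice}"
    if [ $# -ge 3 ]; then
        echo "$base?type=$1&$2=$3"
    else
        echo "$base?type=$1"
    fi
}

martservice_meta_url datasets mart my_mart
```

The last line prints the "Get Datasets" URL from the table above. With the server running, <tt>curl --silent "$(martservice_meta_url attributes dataset mydemo)"</tt> fetches the attribute list.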
+
 
+
==Demo: create data mart using MartBuilder==
+
 
+
===Prepare source data===
+
 
+
cd /home/gmod/software/biomart/data
+
mysql -ugmod -pgmod -e 'create database student'
+
mysql -ugmod -pgmod student < student.sql
+
 
+
===Create student_mart===
+
 
+
We now start MartBuilder:
+
cd /home/gmod/software/biomart/martj-0.7
+
./bin/martbuilder.sh
+
 
+
First add the source schema, '''Schema''' &rarr; '''Add'''
+
 
+
[[Image:MBuilderMenu.png | border]]
+
 
+
Here, please input connection parameters:
+
 
+
[[Image:MBuilderAddSchema.png | border]]
+
 
+
Now you should be able to see the student schema:
+
 
+
[[Image:MBuilderSchemaView.PNG | border]]
+
 
+
Right-click on student table, then choose create dataset for student:
+
 
+
[[Image:MBuilderDatasetView.png | border]]
+
 
+
We are now going to transform the source data into target dataset, but before that, we have to create a target database:
+
 
+
mysql -ugmod -pgmod -e 'create database student_mart'
+
 
+
'''MartRunner''' must also be running. Let's run it on port 8888:
+
cd /home/gmod/software/biomart/martj-0.7
+
./bin/martrunner.sh 8888
+
 
+
MartBuilder will send the '''transformation SQL''' to MartRunner through port 8888, and MartRunner will execute the '''transformation SQL'''. Usually, MartBuilder and MartRunner run on different machines.
+
 
+
 
+
We go back to MartBuilder and click '''Build Mart''':
+
 
+
[[Image:MBuilderRunner.png | border]]
+
 
+
The MartRunner monitor window will show up as below. Click '''Start job''' to build student_mart.
+
 
+
[[Image:MBuilderRunnerMonitor.png | border]]
+
 
+
===Configure and deploy student_mart===
+
 
+
*Start MartEditor; connect to student_mart; '''Naive'''; '''Export'''
+
 
+
*Add one more MartDBLocation entry in my_mart.xml (under <tt>/home/gmod/software/biomart/biomart-perl/conf/</tt>) pointing to student_mart database
+
<xml>
+
  <MartDBLocation
+
        name        = "student_mart"
+
        displayName  = "My Student Database"
+
        databaseType = "mysql"
+
        host        = "localhost"
+
        port        = "3306"
+
        database    = "student_mart"
+
        schema      = "student_mart"
+
        user        = "gmod"
+
        password    = "gmod"
+
        visible      = "1"
+
        default      = ""
+
        includeDatasets = ""
+
        martUser    = ""
+
  />
+
</xml>
+
 
+
*Restart your BioMart Server:
+
 
+
cd ~/software/biomart/biomart-perl/
+
./restart.sh
+
 
+
*Query student_mart using MartView
+
*: http://localhost:9002/biomart/martview
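
The registry edit in the second step above can also be scripted. The sketch below prints a registry file with the contents of a second file, holding one MartDBLocation entry, inserted just before the closing <tt></MartRegistry></tt> tag. The function name and both file names in the usage note are assumptions; the original file is left untouched.

```shell
# Print REGISTRY with the contents of STANZA inserted just before the
# closing </MartRegistry> tag.
# usage: add_mart_location REGISTRY STANZA > new_registry.xml
# (add_mart_location is a hypothetical helper, not part of biomart-perl.)
add_mart_location() {
    awk -v stanza="$2" '
        /<\/MartRegistry>/ {
            # copy the stanza file through before printing the closing tag
            while ((getline line < stanza) > 0) print line
            close(stanza)
        }
        { print }
    ' "$1"
}
```

For example, save the MartDBLocation entry in a file such as <tt>student_mart_location.xml</tt>, run <tt>add_mart_location conf/my_mart.xml student_mart_location.xml > conf/my_mart.new.xml</tt>, inspect the result, and then replace <tt>my_mart.xml</tt> with it.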
+
 
+
==Getting support==
+
 
+
*Further documentation from http://www.biomart.org
+
*Mailing list [mailto:mart-dev@ebi.ac.uk mart-dev@ebi.ac.uk]
+

Latest revision as of 21:41, 2 October 2012
