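| 1. | The ARB software is a graphically oriented package comprising various tools for sequence database handling and data analysis. It is intended for handling sequence databases (import, export, formatting, and so on) and for data analysis, and its database-handling functions are powerful. |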
| 2. | Because sequence data in biological sequence databases are growing rapidly, there is an urgent need for algorithms that combine high biological sensitivity with high efficiency. |
| 3. | The experiments show that this method can find subsequences with similar patterns even when their segments differ considerably in value or scale. (5) We present a similarity-based time series clustering method. The method effectively queries subsequences with different value ranges and different lengths, enabling pattern queries over the sequence database at different segmentation granularities. |
| 4. | Data mining is a technology for finding unknown, hidden, and interesting knowledge in massive data. Time series data mining includes trend analysis, periodic pattern mining, sequential pattern mining, and similarity search. Applying data mining techniques to a time series database can uncover the patterns it contains and thereby extend the database's query capability. |
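| 5. | In this method, the remarkable points of the series are selected as the end points of the segments, and the number of remarkable points can be controlled by the parameter "remarkable duration". The method is robust and consistent. It takes the remarkable points within a specified interval as segment end points; by specifying the size of that interval, the user can intuitively control the segmentation granularity and divide the sequence database into segments of the same granularity. |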
| 6. | Because time series databases keep growing and mining them has potential significance, data mining in time series databases has become a research hotspot. At the same time, however, the nonlinear and chaotic nature of time series data makes such mining difficult. |
| 7. | A time series database describes and stores time series data and provides various query operations, usually based on the values or time coordinates of the sequence elements. Although it can fulfill queries and operations on the sequence data, it cannot find sequences or subsequences that have the same or approximately the same pattern as the query sequence. It is therefore necessary to extend its query capability to find the knowledge hidden in the database. |
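| 8. | A general method of similar sequence mining for time series is to transform the time series into discrete character series and cluster them into different sets, then compute the Euclidean distance between the query series and these sets to measure their similarity. Abstract: To search for similar subsequences in a time series database, methods such as sliding windows and fractal interpolation approximation are commonly used to split the time series into subsequences; each segment is fitted linearly, the Euclidean distance between the query sequence and each subsequence is computed, and subsequences that satisfy the distance threshold are taken as similar. |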
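| 9. | To improve efficiency, we use sampling points of the sequences to compute the distance between two sequences. The distance over the sampling points is used to filter the sequences in the database, so the similarity search space is reduced and query efficiency improves. While preserving each sequence's pattern of variation, the DTW distance between sequences is computed from the sampling points, and a filtering distance chosen according to the sampling ratio and the query parameters is used to filter the sequence database; experimental results show that sampling-based filtering clearly improves query efficiency. |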
| 10. | With the arrival of the post-genome era and the recent development of new high-throughput technologies for mining biological data, vast amounts of sequence data are flooding the DNA and protein databases so rapidly that there is a strong need for efficient and effective computational tools to handle these data. Life science research has entered an era of deep dependence on computers and networks. Among all computational tools, multiple sequence alignment occupies a central position. |
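
Example 8 describes splitting a time series with a sliding window and keeping the subsequences whose Euclidean distance to the query falls within a threshold. The sketch below illustrates only that core idea. The function name `sliding_window_matches`, its parameters, and the sample data are hypothetical, and the linear-fitting and fractal-interpolation steps mentioned in the sentence are left out.

```python
import math


def sliding_window_matches(series, query, threshold):
    """Return (start_index, distance) for each window of len(query)
    whose Euclidean distance to the query is at most threshold.

    Hypothetical helper for illustration; not taken from the quoted papers.
    """
    m = len(query)
    matches = []
    # Slide a window of the query's length across the series.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        # Euclidean distance between the window and the query.
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))
        if dist <= threshold:
            matches.append((start, dist))
    return matches


# Made-up sample data: the pattern 1, 2, 3 occurs twice in the series.
series = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]
query = [1, 2, 3]
matches = sliding_window_matches(series, query, threshold=0.5)
```

This brute-force scan costs on the order of the series length times the query length per query. A filtering step such as the sampling approach in example 9 is one way to shrink the search space before computing exact distances.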