The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'.
- For example, "ACGAATTCCG" is a DNA sequence.
When studying DNA, it is useful to identify repeated sequences within the DNA.
Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.
Example 1:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]
Example 2:
Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]
Constraints:
- 1 <= s.length <= 10^5
- s[i] is either 'A', 'C', 'G', or 'T'.
Solution: Hashtable
Store each 10-letter substring in the hashtable, and add it to the answer array when it appears for the second time. Below, n is the length of s and l = 10 is the substring length.
Time complexity: O(n*l)
Space complexity: O(n*l) with string keys, O(n) with string_view keys
C++
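// Author: Huahua
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
  vector<string> findRepeatedDnaSequences(string_view s) {
    constexpr int kLen = 10;
    const int n = s.length();
    // Keys are views into s, so no substring is copied into the table.
    unordered_map<string_view, int> m;
    vector<string> ans;
    for (int i = 0; i + kLen <= n; ++i)
      // Record a substring exactly once, on its second occurrence.
      if (++m[s.substr(i, kLen)] == 2)
        ans.emplace_back(s.substr(i, kLen));
    return ans;
  }
};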
Optimization
There are 4 types of letters, so each can be encoded in 2 bits. A 10-letter-long string then fits in the lowest 20 bits of an int32, and we can use that int as the key for the hashtable.
A -> 00
C -> 01
G -> 10
T -> 11
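To make the encoding concrete, here is a minimal sketch of packing letters into an integer and sliding the window by one letter. The helper names Code, Encode, and Roll are hypothetical and not part of the original solution, and the input is assumed to contain only 'A', 'C', 'G', and 'T'.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Maps one nucleotide to its 2-bit code (A=0, C=1, G=2, T=3).
// Hypothetical helper; assumes c is one of 'A', 'C', 'G', 'T'.
inline std::uint32_t Code(char c) {
  switch (c) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    default:  return 3;  // 'T'
  }
}

// Packs up to 16 letters into a 32-bit integer, 2 bits per letter,
// with the first letter ending up in the highest used bits.
inline std::uint32_t Encode(std::string_view s) {
  assert(s.size() <= 16);
  std::uint32_t key = 0;
  for (char c : s) key = (key << 2) | Code(c);
  return key;
}

// Slides a window of `len` letters one position to the right:
// the mask drops the oldest letter, and `next` fills the lowest 2 bits.
// Assumes 0 < len < 16.
inline std::uint32_t Roll(std::uint32_t key, char next, int len) {
  const std::uint32_t mask = (1u << (2 * len)) - 1;
  return ((key << 2) & mask) | Code(next);
}
```

Under this mapping, Encode("ACGT") is 0b00011011 = 27, and Roll(Encode("ACGAATTCCG"), 'A', 10) gives the same key as Encode("CGAATTCCGA"), which is why each window can be updated in O(1).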
Time complexity: O(n)
Space complexity: O(n)
C++
// Author: Huahua
#include <array>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
  vector<string> findRepeatedDnaSequences(string s) {
    constexpr int kLen = 10;
    constexpr int mask = (1 << (2 * kLen)) - 1;  // lowest 20 bits
    const int n = s.length();
    unordered_map<int, int> m;
    array<int, 128> km;  // letter -> 2-bit code
    km['A'] = 0;
    km['C'] = 1;
    km['G'] = 2;
    km['T'] = 3;
    vector<string> ans;
    for (int i = 0, key = 0; i < n; ++i) {
      // Drop the oldest letter and append the new one.
      key = ((key << 2) & mask) | km[s[i]];
      if (i < kLen - 1) continue;  // window is not full yet
      if (++m[key] == 2)
        ans.push_back(s.substr(i - kLen + 1, kLen));
    }
    return ans;
  }
};
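Since every key is below 2^20, one possible variation (not from the original post) is to replace the hashtable with a flat array of counters indexed by the key. A sketch, assuming the input contains only 'A', 'C', 'G', and 'T'; the function name FindRepeatedDnaFlat is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Variation on the 2-bit encoding: a flat table replaces the hashtable.
// Hypothetical function name; assumes s contains only 'A', 'C', 'G', 'T'.
std::vector<std::string> FindRepeatedDnaFlat(const std::string& s) {
  constexpr std::size_t kLen = 10;
  constexpr std::uint32_t kMask = (1u << (2 * kLen)) - 1;
  // One small counter per possible 20-bit key (2^20 bytes in total).
  std::vector<std::uint8_t> seen(std::size_t{1} << (2 * kLen), 0);
  std::vector<std::string> ans;
  std::uint32_t key = 0;
  for (std::size_t i = 0; i < s.size(); ++i) {
    std::uint32_t code = 3;  // 'T'
    switch (s[i]) {
      case 'A': code = 0; break;
      case 'C': code = 1; break;
      case 'G': code = 2; break;
      default: break;
    }
    key = ((key << 2) & kMask) | code;
    if (i + 1 < kLen) continue;  // window is not full yet
    // Counters stop at 2, so they cannot overflow and each
    // repeated substring is reported exactly once.
    if (seen[key] < 2 && ++seen[key] == 2)
      ans.push_back(s.substr(i + 1 - kLen, kLen));
  }
  return ans;
}
```

This trades hashing overhead for a fixed 1 MiB table. Whether it is actually faster for a given input would need measuring.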